ABSTRACT

We measure the linear power spectrum of mass density fluctuations at redshift z = 2.5 from the Lyα
forest absorption in a sample of 19 QSO spectra, using the method introduced by Croft et al. (1998). The
P(k) measurement covers the range $2\pi/k \sim 450$ to $2350$ km s$^{-1}$ (2 to 12 comoving $h^{-1}$ Mpc for $\Omega = 1$),
limited on the upper end by uncertainty in fitting the unabsorbed QSO continuum and on the lower
end by finite spectral resolution (0.8 to 2.3 Å FWHM) and by non-linear dynamical effects. We examine
a number of possible sources of systematic error and find none that are significant on these scales. In
particular, we show that spatial variations in the UV background caused by the discreteness of the
source population should have negligible effect on our P(k) measurement. We estimate statistical errors
by dividing the data set into ten subsamples. The statistical uncertainty in the rms mass fluctuation
amplitude, $\sigma \propto \sqrt{P(k)}$, is $\sim 20\%$, and is dominated by the finite number of spectra in the sample.
We obtain consistent P(k) measurements (with larger statistical uncertainties) from the high and low
redshift halves of the data set, and from an entirely independent sample of nine QSO spectra with mean
redshift z = 2.1.

A power law fit to our results yields a logarithmic slope $n = -2.25 \pm 0.18$ and an amplitude $\Delta^2_\rho(k_p) = 0.57^{+0.26}_{-0.18}$,
where $\Delta^2_\rho$ is the contribution to the density variance from a unit interval of $\ln k$ and $k_p = 0.008$ (km s$^{-1}$)$^{-1}$.
Direct comparison of our mass P(k) to the measured clustering of Lyman Break
Galaxies shows that they are a highly biased population, with a bias factor $b \sim 2$ to $5$. The slope of
the linear P(k), never previously measured on these scales, is close to that predicted by models based
on inflation and Cold Dark Matter (CDM). The P(k) amplitude is consistent with some scale-invariant,
COBE-normalized CDM models (e.g., an open model with $\Omega_0 = 0.4$) and inconsistent with others (e.g.,
$\Omega = 1$). Even with limited dynamic range and substantial statistical uncertainty, a measurement of P(k)
that has no unknown "bias factors" offers many opportunities for testing theories of structure formation
and constraining cosmological parameters.

Subject headings: Cosmology: observations, quasars: absorption lines, galaxies: formation, large scale 
structure of Universe 



1. INTRODUCTION 

Much of modern cosmology is based on the hypothesis 
that structure in our Universe arose from the action of 
gravity on small initial density perturbations. The power 
spectrum of these initial fluctuations, P(k), is a fundamen- 
tal prediction of different cosmological theories. Indeed, in 
the most common models, the initial Fourier amplitudes 
of the density are distributed in a Gaussian random fash- 
ion, and P(k) specifies the statistical properties of the ini- 
tial density distribution entirely. A determination of P(k) 
would therefore offer a direct way to test these theories, 
and to constrain any free parameters they might have. 
Also, and perhaps just as importantly, an unambiguous 
measurement of P(k) would serve as a valuable baseline 
for the interpretation of cosmological phenomena. Since
the advent of inflation, theorists of cosmological structure
formation have been blessed with something rare in other
fields of astrophysics: well-motivated and well-specified
initial conditions. Knowledge of P(k) would add tremendous
extra power to quantitative studies of the formation
of galaxies, clusters, and other structures.

One route to P(k) uses observations of microwave back- 
ground anisotropies (the radiation counterpart to the ini- 
tial density fluctuations). However, estimates of the mass 
P(k) derived from such measurements depend on the as- 
sumed values of the cosmological parameters. Further- 
more, the most accurate measurements of microwave back- 
ground anisotropies are presently confined to very large 
scales. Much effort has therefore been spent on trying to 
infer P(k) from surveys of the galaxy distribution (see,
e.g., Vogeley 1998 and references therein). Deriving an 
estimate of the primordial matter P(k) from galaxy mea- 
surements requires at the very least an understanding of 
how the present day distribution of galaxies is related to 
the primordial distribution of mass. This is essentially
another definition of the commonly used term "theory of
galaxy formation", something which cosmology presently
lacks in a form quantitative enough for this exercise to
be carried out (see, e.g., Kauffmann et al. 1998a). Even
with such a theory, the complexity of the processes involved,
such as gas dynamics, star formation, feedback,
and non-linear gravitational collapse, promises to make it
difficult to invert a theoretical relationship to directly recover
P(k).

Galaxies, however, are not the only potential probes of
matter clustering. The Lyα forest seen in quasar spectra
(Lynds 1971; Sargent et al. 1980) can also be used to
study mass fluctuations, but with two important differences.
First, the framework of standard cosmology has
provided us with a well-motivated "theory of Lyα forest
formation", in which the bulk of Lyα absorption at
high z arises in a continuous, fluctuating, and highly ionized
intergalactic medium (see, e.g., Bi & Davidsen 1997;
Hui, Gnedin, & Zhang 1997; Weinberg, Katz, & Hernquist
1998b; Rauch 1998 and references therein). Second,
the situation described by the theory is simple, and leads
to the prediction that an approximately local relationship
holds between the absorbed flux in a QSO spectrum and
the underlying matter density, a relationship which can be
inverted to learn about matter clustering. In particular,
P(k) itself can be recovered over a limited range of scales,
as shown by Croft et al. (1998, hereafter CWKH). Here we
will apply the procedure of CWKH to recover P(k) from a
moderately large sample of QSO spectra of the Lyα forest.

The modern picture of the Lyα forest has arisen from
theoretical studies of gas in the gravitational instability
scenario for the formation of structure. This theoretical
picture was originally proposed to explain observations of
galaxy clustering and formation. It was then discovered
that, when the effect of a background UV ionizing radiation
field is included, the same theories naturally predict
the existence of QSO absorption phenomena. These predictions
have been followed using semi-analytic techniques
(e.g., McGill 1990; Bi 1993; Bi, Ge, & Fang 1995; Reisenegger
& Miralda-Escudé 1995; Bi & Davidsen 1997; Hui et al.
1997), numerical simulations of cosmological hydrodynamics
(e.g., Cen et al. 1994; Zhang, Anninos, & Norman 1995;
Hernquist et al. 1996; Wadsley & Bond 1996; Theuns et
al. 1998), and approximate N-body methods (e.g., Petitjean,
Mücket, & Kates 1995; Gnedin & Hui 1998). The
simulations and analytic models imply that the Lyα forest
arises primarily in diffuse gaseous structures of large physical
extent, consistent with the large transverse coherence
length found in paired QSO observations (Bechtold et al.
1994; Dinshaw et al. 1994, 1995; Crotts & Fang 1998).



The absorbing structures that dominate the Lyα opacity
at high redshift have gas densities fairly close to the cosmic
mean, and they are still typically expanding with residual
Hubble flow, so that the velocity width of absorption features
seen in QSO spectra corresponds mainly to a physical
width (see the discussion in Weinberg et al. 1997a). The
effect of thermal broadening is minor, so that the picture is
qualitatively very different from previous representations
of the Lyα forest features as discrete clouds with a physical
extent much smaller than their thermal profiles.

The physical state of the gas is largely governed by the
competing processes of photoionization heating by the UV
background and adiabatic cooling due to the expansion of
the Universe. This places most of the gas within a factor
of 10 of the mean density on a power law temperature-density
relation (Katz, Weinberg & Hernquist 1996;
Hui & Gnedin 1997), so that

$$T = T_0\,\rho_b^{\alpha}, \qquad (1)$$

where $\rho_b$ is the baryon overdensity in units of the cosmic
mean. The parameters $T_0$ and $\alpha$ depend on the reionization
history of the Universe and on the spectral shape
of the UV background. They are expected to lie in the
ranges 4000 K $\lesssim T_0 \lesssim$ 15,000 K and $0.3 \lesssim \alpha \lesssim 0.6$ (Hui
& Gnedin 1997). In the moderate and low density regions
that produce the Lyα forest, pressure gradients are small
compared to gravitational forces, so that the gas tends to
trace the structure of the dark matter and $\rho_b \approx \rho$. The
optical depth for Lyα absorption is proportional to the
neutral hydrogen density (Gunn & Peterson 1965), which
for this gas in photoionization equilibrium is proportional
to the density times the recombination rate. These proportionalities
lead to a power law relationship between optical
depth, $\tau$, and baryon density, $\rho_b$,
with $\beta = 2 - 0.7\alpha$ in the range 1.6 to 1.8. Here $\Gamma$ is the
HI photoionization rate, H(z) is the Hubble constant at
redshift z, $h = H_0/(100$ km s$^{-1}$ Mpc$^{-1})$, and $\rho_b$ is in
units of the mean cosmic baryon density. As representative
fiducial values we have adopted the baryon density
$\Omega_b h^2$ advocated by Burles & Tytler (1998), the Hubble ratio
$H(z)/H_0$ appropriate to an $\Omega = 0.3$, $\Lambda = 0.7$ universe
at z = 2.5, the temperature $T_0$ for mean density gas from
the SPH simulation of Katz et al. (1996), and the photoionization
rate $\Gamma$ computed by Haardt & Madau (1996)
at $z \sim 2$ to $3$. Equation (2) is based on a hydrogen recombination
coefficient $\alpha(T) = 4.2 \times 10^{-13}\,(T/10^4\,{\rm K})^{-0.7}$,
which was adopted by Rauch et al. (1997) as a good approximation
to the recombination coefficient of Abel et al.
(1997) in the temperature range that is most relevant for
the Lyα forest. Because equation (2) describes the analog
of Gunn-Peterson absorption for a non-uniform, photoionized
medium (ignoring the effect of peculiar velocities), we
will refer to it as the Fluctuating Gunn-Peterson Approximation
(FGPA, see Rauch et al. 1997; CWKH; Weinberg
et al. 1998b). If we test the FGPA using artificial spectra
extracted from simulations (see, e.g., Figure 6 of Croft
et al. 1997), we find that there is some scatter in the relation
between transmitted flux ($F = e^{-\tau}$) and gas density
because the spectrum is measured in redshift space and because
thermal broadening, shock heating, collisional ionization,
and other effects included in the simulations are
not accounted for in the FGPA. However, the regions that
exhibit a substantial deviation from this approximation
only constitute a small fraction of the total length of the
spectra. Any application of equation (2) should be tested
on a case by case basis with simulations. We do not make
explicit use of equation (2) in the P(k) recovery method,
but we will frequently refer to it to provide physical motivation
for our analysis.
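As a sketch of where the exponent $\beta$ comes from, the proportionalities stated in the text can be chained together. The symbol $n_{\rm HI}$ (neutral hydrogen density) and the lumped constant $A$ are used here for illustration only, and the numerical prefactor of equation (2) is not reproduced. Note that $\alpha(T)$ denotes the recombination coefficient while $\alpha$ alone is the slope of equation (1).

```latex
\begin{align}
\tau &\propto n_{\rm HI} \propto \frac{\rho_b^{2}\,\alpha(T)}{\Gamma},
  \qquad \alpha(T) \propto T^{-0.7}, \\
T &= T_0\,\rho_b^{\alpha}
  \;\Longrightarrow\;
  \tau \propto \frac{T_0^{-0.7}}{\Gamma}\,\rho_b^{\,2-0.7\alpha}
  \equiv A\,\rho_b^{\beta},
  \qquad \beta = 2 - 0.7\alpha .
\end{align}
```

For $0.3 \lesssim \alpha \lesssim 0.6$ this gives $\beta$ between about 1.58 and 1.79, in line with the range quoted in the text.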

The principle behind the P(k) recovery method of 
CWKH is that the flux, F, measured from QSO spectra 
can make effective use of observations with spectral resolution
(FWHM) as poor as 2 Å, corresponding to a Gaussian
dispersion $\sigma = 0.7$ Å $\approx 50$ km s$^{-1}$ at z = 2.5. The
signal-to-noise ratio requirements are also not very stringent,
basically because the Lyα forest data are in the form
of a continuous one-dimensional field, so that we do not
suffer from the shot noise present in galaxy data. The
errors that affect our determination of P(k) are mainly 
"cosmic variance" errors (more precisely, variations in the 
structure probed by a finite number of spectra), and the re- 
quirement is therefore for a data set that samples as many 
independent sightlines as possible. The signal-to-noise ra- 
tio and resolution do have a secondary effect in that they 
determine the accuracy with which the unabsorbed QSO 
continuum can be estimated. As explained in Section 3.1, 
uncertainties in this determination affect the measurement 
of P(k) on the largest scales. 

2.1. The QSO spectra 

The primary data sample used here represents a reasonable
compromise between the needs for resolution, signal-to-noise
ratio, and multiple sightlines. It is drawn from
the survey of Damped Lyα systems (hereafter DLA) by
Pettini et al. (1994, 1997) and consists of 19 QSO spectra
obtained over the period 1987 to 1994 with the William
Herschel telescope on La Palma, Canary Islands and with
the Anglo-Australian telescope at Siding Spring Observatory,
Australia. The spectra are reproduced in Figure 1.
The resolution ranges between 0.8 Å and 2.3 Å FWHM
(typically $\sim 1.5$ Å FWHM), and the signal-to-noise ratio
is between $\sim 10$ and $\sim 90$ (typically S/N > 40). Further
details of the data acquisition and reduction procedures
can be found in Pettini et al. (1997).

The QSO emission redshifts range from $z_{\rm em} = 3.23$
(Q0347-383) to $z_{\rm em} = 2.084$ (Q1331+170). The spectra,
which were designed to straddle the wavelength of Lyα
in intervening DLA systems, thus cover different redshift
ranges in the Lyα forest. In Figure 1, the portion of each
spectrum that was used in our analysis is shown inside a
solid box, which marks the wavelength region between the
QSO Lyα and Lyβ emission lines. It can be seen from the
Figure that most of our sightlines sample a redshift range
centered near z = 2.5.

For our analysis we have constructed several subsamples 
of the data from this primary sample of QSO spectra, as 
follows: 

(1) A "fiducial" sample, containing all the data between 
z = 2 and z = 3. We will concentrate on this sample for 
most of our analysis. The restricted range of z is enforced 
so that the effects of redshift evolution are limited. The 
total length of Lyα to Lyβ regions in this sample (once
it has been prepared as described in Section 2.2 below) is
$4.8 \times 10^5$ km s$^{-1}$, and the mean z = 2.5.

(2) The full sample, containing all the data. The total
length is $6.4 \times 10^5$ km s$^{-1}$, and the mean z = 2.4. We will
split this sample into 10 different subsamples in order to 
estimate the errors on P(k) (see Section 3.1). 

(3) A low-z sample, for studying the effect of z evolution, 
consisting of all the data with z < 2.4. This sample has 

1 Here we have included a factor of $2\pi$ that was omitted by error from the formula in CWKH. However, the tests with higher resolution PM
simulations in Section 3 below suggest that this cutoff scale may have been partially set by the resolution of the CWKH simulations (a point
also made by Haehnelt 1998).



constitutes a continuous, one-dimensional field whose rela- 
tion on a point-by-point basis to the underlying matter dis- 
tribution is governed approximately by equation (2). Ap- 
plying a monotonic mapping of the flux to give it a Gaus- 
sian probability distribution function converts a spectrum 
to a line-of-sight initial density field with arbitrary nor- 
malization. The one-dimensional power spectrum of this 
density field can be inverted to give the three-dimensional 
P(k). The amplitude of P(k) is set by running normaliz- 
ing simulations with different P(k) amplitudes (assuming 
Gaussian initial conditions) and picking the one for which 
the clustering of the flux in artificial spectra matches that 
in the observations. The value of the uncertain parameter 
A is determined in the normalizing simulations by match- 
ing an independent observation, the effective mean optical 
depth $\tau_{\rm eff} = -\ln\langle e^{-\tau}\rangle$. It is this observational determination
of A that removes any dependence of the derived P(k)
on unknown "bias factors" — the shape and amplitude of 
P(k) are both recovered. 

The rest of the paper is arranged as follows. The spectra 
of the Lyα forest that constitute our observational data set
are briefly described in Section 2. The bulk of the paper 
(Section 3) deals with the details of the P(k) recovery, in- 
cluding tests of the sensitivity of our results to continuum 
fitting and to the resolution of the simulations used to de- 
rive the normalization of P(k). In Section 4 we show that 
the artificial clustering that could be caused by fluctua- 
tions in the UV ionizing background, which in principle 
could bias our P(k) measurement, is in practice too small 
to be significant on the scales where we can measure P(k). 
In Section 5 we present a tabulation of our results and a 
power law fit to the data. We also compare our determi- 
nation of P(k) with the predictions of specific Cold Dark 
Matter (CDM) models and with recent measurements of 
galaxy clustering at z = 3 and z = 0. Finally, in Section 6 
we summarize our main results and outline directions for 
future work. As Sections 2.2 through 4 focus on techni- 
cal details of the application of the CWKH procedure and 
tests of its robustness, readers who are interested mainly 
in the final P(k) result and a discussion of it should skip 
ahead to Section 5 after reading Section 2.1. A brief sum- 
mary of the CWKH procedure is given at the beginning of 
Section 3. 

2. OBSERVATIONAL DATA 

An advantage of studying the properties of matter clus- 
tering on relatively large scales is that we do not necessar- 
ily need to use extremely high resolution or high signal- 
to-noise ratio (S/N) data. There will be a minimum scale 
below which the procedure for recovering the linear P(k) 
does not work because of the combined effects of peculiar 
velocities, thermal broadening, and non-linear evolution 
of the density field. The tests of CWKH on hydrody- 
namic simulations showed recovery of the correct linear 
P(k) on large scales but suppression of power on small 
scales, which could be approximately modeled by smooth- 
ing the linear P(k) with a Gaussian filter, of the form 
$\exp(-k^2 r_s^2/2)$, with $r_s = 1.5/2\pi\ h^{-1}$ Mpc ($\sim 50$ km s$^{-1}$
at z = 3).$^1$ Information on smaller scales than this is
therefore not directly useful to us at present, and so we 
Fig. 1. The 19 QSO spectra that constitute the main data sample used in this paper. The solid boxes drawn around parts of each
spectrum represent the region from Lyα to Lyβ. Solid dots are drawn at the redshifts of the DLA systems. The regions around these are
excluded from the analysis (see text). Vertical dotted lines are drawn at the wavelength of Lyα at z = 2 (left) and z = 3 (right).



total length $3.2 \times 10^5$ km s$^{-1}$ and mean z = 2.1.

(4) A high-z sample, the data with z > 2.4, which has
total length $3.2 \times 10^5$ km s$^{-1}$ and mean z = 2.75.

The data preparation procedure (described in Section 2.2
below) removes regions around the DLA redshifts and near
the QSO redshift prior to analysis of any of these samples.

In addition to analyzing these data, we make use of a
secondary, independent set of observations of the Lyα forest
towards nine QSOs in a southern field, 40 arcminutes in
diameter, centered at RA = 01 31 45 and Dec = -40 36 12
(B1950). These data were obtained in November 1986 by
M. Pettini and R. Buss at the Cassegrain focus of the
Anglo-Australian telescope fed by the FOCAP multi-fiber
system, and are reproduced in Figure 2. All the spectra
cover the wavelength region 3400 to 4300 Å with a resolution
of 2.2 Å FWHM; the total exposure time of 56 000 s
resulted in S/N $\sim 9$ to $35$ (the QSO magnitudes range from
B = 17.4 to 20.7). The mean redshift of the useful portions
of these spectra is z = 2.1, conveniently the same as that of
the low-z subsample of our primary data set described at
point (3) above. We therefore decided to analyze this sec- 
ondary sample separately, rather than combining it with 
the main data set, so as to obtain an independent check 
on the results deduced from our primary sample. 

2.2. Data preparation 



Before applying the P(k) recovery machinery, the data
must be continuum-fitted and processed in a few further
ways. We describe our data preparation
below and test the effects of varying the parameter
choices in Section 3.1.

First, we find the unabsorbed continuum level in the
data in an automated fashion. We use a standard iterative
technique tested on simulations by Davé et al. (1997) and
CWKH. The procedure is governed by one free parameter,
$L_{\rm fit}$, a length in Å. We fit a third order polynomial to a
region in the QSO spectrum of length $2L_{\rm fit}$. We then discard
all points $2\sigma$ below the fit line and fit again, iterating
until convergence has been reached. The continuum level
for the central $L_{\rm fit}$ part of this region is set by the final
level of the polynomial. We then move $L_{\rm fit}/2$ onwards in
wavelength and fit the next portion of the spectrum, with
the continuum fitted regions being joined together. We are
therefore using buffer zones of length $L_{\rm fit}/2$ around each
region. The buffer zones stop the continuum from curving
downwards artificially if the $L_{\rm fit}$ region happens to end at
a patch of high absorption. For our fiducial sample, we
use $L_{\rm fit} = 50$ Å.
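The iterative continuum fit can be illustrated with a minimal Python sketch. It is not the code used for the paper: the function name `fit_continuum`, the use of the rms residual of the retained pixels as $\sigma$, the iteration cap, and the choice to advance the window by $L_{\rm fit}$ so that the central regions tile the spectrum are all assumptions made for this example.

```python
import numpy as np

def fit_continuum(wave, flux, l_fit=50.0, n_sigma=2.0, order=3, max_iter=20):
    """Piecewise continuum estimate by iterative low-side clipping (sketch)."""
    wave = np.asarray(wave, dtype=float)
    flux = np.asarray(flux, dtype=float)
    continuum = np.full_like(flux, np.nan)
    start = wave[0]
    while start <= wave[-1]:
        lo, hi = start, start + l_fit
        # Fitting window of length 2*l_fit: central part plus l_fit/2 buffers.
        in_win = (wave >= lo - 0.5 * l_fit) & (wave <= hi + 0.5 * l_fit)
        x, y = wave[in_win], flux[in_win]
        x0 = x.mean()  # centre the abscissa to keep the fit well conditioned
        keep = np.ones(x.size, dtype=bool)
        for _ in range(max_iter):
            coeffs = np.polyfit(x[keep] - x0, y[keep], order)
            model = np.polyval(coeffs, x - x0)
            # Assumption: sigma is the rms residual of the pixels kept so far.
            sigma = np.std(y[keep] - model[keep])
            new_keep = y > model - n_sigma * sigma
            if new_keep.sum() <= order + 1 or np.array_equal(new_keep, keep):
                break
            keep = new_keep
        central = (wave >= lo) & (wave < hi)
        continuum[central] = np.polyval(coeffs, wave[central] - x0)
        start += l_fit  # assumption: step chosen so central regions tile
    return continuum
```

Dividing the observed flux by the returned array gives the continuum-normalized flux F used in the rest of the analysis.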

Second, we prune the spectra to remove regions close to 
the QSO, which might be affected by its ionizing radiation
(the proximity effect; see, e.g., Murdoch et al. 1986;
Fig. 2. The spectra of 9 QSOs in a field centered near 013145-403612. P(k) will be measured from this additional independent sample
of data and used to check our results from the main sample (see text). The solid boxes drawn around parts of each spectrum represent the
region from Lyα to Lyβ. Vertical dotted lines are drawn at the wavelength of Lyα at z = 2 (left) and z = 3 (right).



Bajtlik, Duncan & Ostriker 1988). We also remove the 
DLA systems, because in the QSO spectra of the DLA 
survey they are obviously present with a higher number 
density than the cosmic mean, which is 0.2 ± 0.05 per unit 
z interval at z = 2.5 (Lanzetta et al. 1991). They are also 
caused by gas of much higher density than we expect to be 
described by equation (2). One might worry that by pruning
the spectra we will somehow bias the clustering in our
sample, as we are excluding high density regions. This is
probably not the case, because the high densities correspond to saturated
parts of the spectra and tend to be given relatively
low weight in the clustering analysis anyway. It might also
be thought that there is an opposite effect, whereby the 
extra clustering in the mass around DLA systems could 
bias the overall clustering level upwards. This is also ex- 
tremely unlikely, as the DLA systems have a high enough 
space density that any enhanced clustering due to each 
one can only extend over a tiny fraction of each spectrum. 
In any case, when we carry out the P(k) recovery, we will 
test the effect of excluding a large (100 Å) region around
the DLA systems, and also of not excluding them at all. 

Third, we attempt to mitigate the effects of evolution 
over the redshift range subtended by each individual sam- 
ple. The most noticeable effect of z evolution is the de- 
crease in the mean optical depth, which takes place as the 
Universe expands and the space density of hydrogen atoms 
decreases. In an Einstein-de Sitter Universe, the optical 
depth of photoionized gas evolves as $\tau \propto (1+z)^{4.5}$ owing
to this effect. We follow Rauch et al. (1997) and CWKH in 
rescaling the fluxes in the spectra using this relation to the 
value they would have at the mean redshift of each sample 
(which is reasonable since all models are approximately 
Einstein-de Sitter at these redshifts). The spatial scales 
will also change due to the expansion of the Universe. To 
first order, we can correct for this by scaling all pixels to 
the size they would have in km s$^{-1}$ at the mean redshift
of the sample. In practice, this results in a constant scaling
factor relating pixel sizes in Å to km s$^{-1}$. There will
be additional, second order effects due to the change in H 
over the redshift range, but these will be small, and model 
dependent, so we do not attempt to correct for them. 
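The two evolution corrections can be sketched as follows, assuming continuum-normalized flux and the $\tau \propto (1+z)^{4.5}$ scaling quoted above. The function names, the clipping floor, and the adopted Lyα rest wavelength of 1215.67 Å are choices made for this illustration rather than details taken from the paper.

```python
import numpy as np

C_KMS = 299792.458   # speed of light in km/s
LYA_REST = 1215.67   # Lyman-alpha rest wavelength in Angstrom (assumed value)

def rescale_to_mean_redshift(wave, flux_norm, z_mean, index=4.5):
    """Scale normalized flux to z_mean, with tau proportional to (1+z)**index."""
    wave = np.asarray(wave, dtype=float)
    z = wave / LYA_REST - 1.0
    # Clip to avoid log(0) in saturated pixels (a sketch-level choice).
    f = np.clip(np.asarray(flux_norm, dtype=float), 1e-6, None)
    tau = -np.log(f)
    tau_scaled = tau * ((1.0 + z_mean) / (1.0 + z)) ** index
    return np.exp(-tau_scaled)

def pixel_width_kms(dlambda, z_mean):
    """Constant Angstrom-to-km/s conversion at the sample mean redshift."""
    return C_KMS * dlambda / (LYA_REST * (1.0 + z_mean))
```

With these assumptions, `pixel_width_kms(0.7, 2.5)` is close to 49 km/s, which matches the correspondence between 0.7 Å and roughly 50 km s$^{-1}$ at z = 2.5 quoted in Section 2.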

Once we have treated the data as detailed above, we 
are left with a number of disjoint spectrum segments, of 
various lengths, because of the varying wavelength cover- 
age and because the spectra have been broken up by the 
exclusion of DLA systems. We discard all segments that 
are shorter than a certain length (we use 100 Å), chosen
to be at least a factor of 3 larger than the maximum scale 
on which we measure P(k), so that the effects of convolu- 
tion with the Fourier transform of the window function are 
negligible. The data preparation procedure is illustrated 
in Figure 3, which shows the continuum fit and excluded 
wavelength regions for one of the spectra in the primary 
data sample. 

3. RECOVERY OF THE POWER SPECTRUM 

The method we use for recovery of P(k) from QSO Lyα
forest spectra is described and tested in detail in CWKH. 
For completeness, we now give a brief account of the three 
principal steps in the procedure: 

(1) We convert the spectra to one-dimensional linear 
density fields, by mapping the flux values in pixels mono- 
tonically to give them a Gaussian probability distribution 
function (PDF) with arbitrary normalization. This "Gaus- 
sianization" procedure is motivated by the fact that grav- 
itational instability approximately preserves the rank or- 
der of (smoothed) densities (Weinberg 1992), so that one 
way of recovering the initial density field is to monotoni- 
cally map the final densities back to the initial PDF, here 
assumed to be Gaussian. As the transformation between 
flux and density given by the FGPA is also local and mono- 
tonic, mapping the PDF of the flux directly to a Gaussian 
yields an initial density field, to the extent that these ap- 
proximations hold. We note, however, that our results for 
the shape of P(k) are insensitive to the precise nature of
the transformation applied to the observed flux. For example,
it was found in CWKH that the power spectrum of the
flux itself has the same shape as the linear P(k). We have
found by numerical experiments that any transformation
of the density that suppresses the contribution of the high
density regions (including Gaussianization, $F = e^{-A\rho^{\beta}}$,
or even truncation at $\rho/\bar{\rho} = 5$) tends to produce a field
whose power spectrum has the linear P(k) shape. Thus,
Gaussianization is not indispensable to the P(k) recovery 
method, although it appears to be a useful way of "regu- 
larizing" spectra and thus reducing noise in the recovery 
(CWKH). In Section 5.4, we will briefly compare the shape 
of the primordial P(k) to the non-linear P(k) of the mass 
in simulations. 
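The Gaussianization of step (1) amounts to a rank-order mapping, which can be sketched in a few lines of Python. This is an illustration rather than the CWKH implementation: the function name `gaussianize`, the mid-rank quantile convention, and the unit-variance target are assumptions, and ties in flux are broken arbitrarily.

```python
import numpy as np
from statistics import NormalDist

def gaussianize(flux):
    """Monotonic rank-order map of flux values onto a unit Gaussian PDF."""
    flux = np.asarray(flux, dtype=float)
    n = flux.size
    ranks = np.argsort(np.argsort(flux))  # 0 .. n-1; lowest flux has rank 0
    # Low flux means strong absorption, i.e. high density, so invert ranks.
    quantiles = 1.0 - (ranks + 0.5) / n
    inv_cdf = NormalDist().inv_cdf
    return np.array([inv_cdf(q) for q in quantiles])
```

The output has zero mean and arbitrary (here unit) normalization, which is all that step (1) requires, since the amplitude is fixed separately in step (3).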

(2) We measure $P_{\rm 1D}(k)$, the one-dimensional power
spectrum of this density field, using a Fast Fourier Transform.
We convert this $P_{\rm 1D}(k)$ to the three-dimensional
P(k) by differentiation (Kaiser & Peacock 1991; CWKH),

$$P(k) = -\frac{2\pi}{k}\,\frac{d}{dk}P_{\rm 1D}(k). \qquad (3)$$

Equation (3) assumes that the distribution of matter is
isotropic with respect to the line of sight. Redshift-space
distortions caused by peculiar velocities mean that this is 
not strictly true (Kaiser 1987), and these distortions must 
be taken into account for a truly accurate inversion of one- 
dimensional clustering (Hui 1998). We find in simulation 
tests (e.g., those in CWKH) that any error in the shape of 
the 3D P(k) caused by redshift-space distortions is small 
and well within the statistical errors for the present obser- 
vational determination of P(k), although it could have a 
noticeable effect in some future samples. In step (3) below, 
we use our measured P(k) shape as an input to the normalizing
simulations. The P(k) that we use for this purpose
corresponds to P(k) from equation (3) multiplied by
$\exp(k^2 r_s^2/2)$, with $r_s = 34$ km s$^{-1}$, in order to compensate
for power lost on small scales due to the finite resolution
of the observations, as discussed in Section 2. However,
we will only compare our recovered P(k) to theoretical
predictions on scales where $k < 0.5/r_s$. In CWKH it was
shown that the recovered P(k) on these larger scales is in- 
sensitive to the details of the power restoration on smaller 
scales. 
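Step (2) can be sketched numerically as below, assuming the standard isotropic relation $P(k) = -(2\pi/k)\,dP_{\rm 1D}/dk$ between one- and three-dimensional power spectra. The function names, the Fourier normalization, and the use of simple finite differences are choices made for this example; a real measurement would bin and smooth $P_{\rm 1D}(k)$ before differentiating, since differentiation amplifies noise.

```python
import numpy as np

def power_1d(delta, dv):
    """One-dimensional power spectrum of a field sampled every dv (km/s)."""
    delta = np.asarray(delta, dtype=float)
    n = delta.size
    length = n * dv
    # Approximate the continuous Fourier transform by dv times the DFT.
    dk = np.fft.rfft(delta - delta.mean()) * dv
    k = 2.0 * np.pi * np.fft.rfftfreq(n, d=dv)
    p1d = np.abs(dk) ** 2 / length
    return k[1:], p1d[1:]  # drop the k = 0 mode

def power_3d_from_1d(k, p1d):
    """P(k) = -(2*pi/k) dP_1D/dk, evaluated by finite differences."""
    k = np.asarray(k, dtype=float)
    return -(2.0 * np.pi / k) * np.gradient(np.asarray(p1d, dtype=float), k)
```

As a consistency check, a Gaussian $P_{\rm 1D}(k) = \exp(-k^2 r^2/2)$ should map to $2\pi r^2 \exp(-k^2 r^2/2)$ under this relation.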

(3) The P(k) resulting from step (2) is still of arbitrary 
amplitude. To determine the normalization, we use sim- 
ulations that have Gaussian initial conditions (i.e., ran- 
dom Fourier phases) and an initial power spectrum with 
the same shape as our measured P(k) (with small scale 
power restored as explained above) but with various lin- 
ear theory amplitudes. The higher the power spectrum 
amplitude, the larger the fluctuations in the evolved mass 
density field, and hence the larger the predicted fluctu- 
ations in the observed flux. We can therefore pick the 
correct P(k) amplitude by comparing clustering in spec- 
tra extracted from these simulations with the observations 
themselves. The statistic that we choose to make the com- 
parison with is the three-dimensional power spectrum of 
the flux (more precisely, the power spectrum of $F/\langle F\rangle - 1$,
where F is the ratio of the observed flux to the unabsorbed
continuum). To distinguish this from P(k) of the mass in
plots, we will plot

$$\Delta^2_F(k) = k^3 P_F(k), \qquad (4)$$

where $P_F(k)$ is the three-dimensional power spectrum of
flux. The quantity $\Delta^2_F(k)/2\pi^2$ is the contribution to the
variance of the flux from an interval $d\ln k = 1$. We run the
simulations using the PM approximation, where we use a 
standard PM N-body code to evolve the mass distribution 
and assume (a) that the gas pressure effects in the low and 
moderate density regions are unimportant, so that the gas 
traces the dark matter, and (b) that the gas follows the 




Fig. 3. An example QSO spectrum (Q0049-283), plotted against wavelength (Å). Vertical dotted lines are drawn at the emission wavelength of Lyα (right) and Lyβ
(left). We have also plotted the continuum fitted by our procedure (see text) with a fitting length $L_{\rm fit} = 50$ Å. The shading denotes regions
excluded from the analysis. These are the regions blueward of Lyβ, redward of Lyα, within 50 Å of either of the two DLA systems, and
within 20 Å of the QSO.
power law temperature-density relation discussed in Section 1.
The CWKH tests show that the PM approximation
gives accurate predictions of the flux power spectrum
relative to full hydrodynamic simulations. In making the
artificial spectra from the normalizing simulations, there
is one free parameter, in addition to the P(k) amplitude,
that can also influence the amplitude of flux fluctuations:
the parameter A of equation (2). Although it depends on
physical quantities that are not known individually (such
as $\Omega_b$, $\Gamma$, and $T_0$), A as a whole can be set by appealing to
one observational measurement that we have not yet used,
the mean flux level in the spectra. We therefore fix A in our
normalizing simulations by picking the value for which the
spectra have the same mean flux level as the observational
measurements of Press, Rybicki & Schneider (1993, hereafter
PRS). The mean flux level, $\langle F\rangle$, is often expressed in
terms of an effective optical depth, $\tau_{\rm eff} = -\ln\langle F\rangle$. Any
uncertainty in the value of $\tau_{\rm eff}$, and hence in A, is directly
linked to uncertainty in the amplitude of P(k). Our choice
of a particular observational determination of this quantity
could therefore affect our results appreciably. We will
discuss this issue further later in the paper.
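Fixing A by matching the mean flux can be sketched as a one-dimensional root find, assuming the FGPA form $F = e^{-A\rho^{\beta}}$ and ignoring peculiar velocities and thermal broadening, which the normalizing simulations do include. The function name `solve_A`, the bracketing interval, and the bisection in log A are assumptions of this example, not a description of the code used for the paper.

```python
import numpy as np

def solve_A(rho, beta, mean_flux_target, lo=1e-4, hi=1e3, tol=1e-8):
    """Find A such that <exp(-A * rho**beta)> equals the target mean flux."""
    rho_beta = np.asarray(rho, dtype=float) ** beta

    def mean_flux(a):
        return np.exp(-a * rho_beta).mean()

    # The mean flux decreases monotonically with A, so bisection applies.
    if not (mean_flux(hi) < mean_flux_target < mean_flux(lo)):
        raise ValueError("target mean flux is outside the bracketed range")
    while hi / lo > 1.0 + tol:
        mid = np.sqrt(lo * hi)  # bisect in log A
        if mean_flux(mid) > mean_flux_target:
            lo = mid
        else:
            hi = mid
    return np.sqrt(lo * hi)
```

Here `rho` would be the density along simulated lines of sight in units of the mean, and `mean_flux_target` would be $e^{-\tau_{\rm eff}}$ for the adopted observational value of $\tau_{\rm eff}$.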

Given that the above procedure seems rather complicated, one could ask why we do not simply attempt a direct inversion of the flux to a mass distribution, using the FGPA as a guide, along the lines of the procedure proposed recently by Nusser & Haehnelt (1998). For purposes of determining the primordial P(k), our more indirect procedure is more robust and more broadly applicable, for several reasons. First, our approach is relatively insensitive to what is occurring in saturated regions, which cannot be inverted directly from observations of Lyα absorption alone, and which in any case are less likely to obey the FGPA. Second, our method relies mainly on large scale clustering information; it therefore does not require data that fully resolve all Lyα features. Finally, the use of simulations in the normalizing procedure provides a convenient way to estimate the unknown parameter A, and it automatically includes the effects of non-linear gravitational evolution and peculiar velocities.

In our method of deriving P(k), the assumption that primordial fluctuations are Gaussian enters mainly into the normalization step (3), since we use Gaussian fluctuations to initialize our normalizing simulations. If we adopted non-Gaussian initial conditions with the same P(k) shape, then the P(k) amplitude required in order to match the observed flux power spectrum with the observed $\tau_{\rm eff}$ as a constraint might be different. The Gaussian assumption also motivates the Gaussianization procedure applied in step (1), but since the derived shape of P(k) would be similar even without Gaussianization, it seems likely that recovery of the shape of P(k) does not depend much on the assumption of Gaussian initial conditions. However, all of the tests in CWKH and in this paper are for initially Gaussian models, and the success of our method in recovering the shape and amplitude of P(k) in non-Gaussian models would need to be tested on a case-by-case basis.

3.1. The shape of P(k) 

We now turn to the analysis of the observational data. First, we measure the shape of P(k) as described in steps (1) and (2) above, and also measure $\Delta^2_F(k)$. In order to estimate errors, we take the whole data set [sample (2) of Section 2.1] and split it into 10 subsamples of roughly equal length. We estimate P(k) and $\Delta^2_F(k)$ individually for each of the subsamples; the results are plotted as points in Figure 4. When Gaussianizing the flux to yield an initial density field, we set the $\sigma$ of the Gaussian PDF to be the same for each of the subsamples. For several of the subsamples, the values of P(k) and $\Delta^2_F(k)$ for the largest scale plotted in Figure 4 are unphysically negative, as are a few measurements on smaller scales. This can occur when the measured one-dimensional power spectrum is noisy, as the noise may result in regions with a positive slope, $dP_{\rm 1D}(k)/dk$, so that the inversion of equation (3) yields negative values for P(k). The largest scale point plotted marks the limit where cosmic variance noise is small enough for this sample to allow us to make a reasonable inversion from one-dimensional to three-dimensional clustering. We will see later that the real maximum scale on which we can believe the P(k) measurement appears to be slightly smaller, and is set by continuum fitting.
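
The way noise in the one-dimensional power spectrum produces negative P(k) can be seen in a small numerical sketch. It assumes that equation (3) has the standard form for an isotropic field, $P(k) = -(2\pi/k)\,dP_{\rm 1D}/dk$; the Gaussian test spectrum, the 5% noise level, and the function name are illustrative assumptions, not values from the analysis.

```python
import numpy as np

def invert_1d_to_3d(k, p1d):
    """Assumed form of the 1D -> 3D inversion:
    P(k) = -(2*pi/k) * dP_1D/dk, with a finite-difference derivative."""
    return -(2.0 * np.pi / k) * np.gradient(p1d, k)

k = np.linspace(0.1, 3.0, 300)
r = 1.0

# Analytic pair: P_3D = exp(-k^2 r^2)  <->  P_1D = exp(-k^2 r^2) / (4 pi r^2)
p3d_true = np.exp(-(k * r) ** 2)
p1d = np.exp(-(k * r) ** 2) / (4.0 * np.pi * r**2)

# Noise-free inversion recovers the input 3D spectrum.
p3d_recovered = invert_1d_to_3d(k, p1d)

# With noise, P_1D acquires locally positive slopes, and the
# recovered P(k) becomes negative at those wavenumbers.
rng = np.random.default_rng(1)
p1d_noisy = p1d * (1.0 + 0.05 * rng.standard_normal(k.size))
p3d_noisy = invert_1d_to_3d(k, p1d_noisy)
n_negative = int(np.sum(p3d_noisy < 0))
```

Differentiation amplifies point-to-point noise, which is why the inversion is better behaved when the one-dimensional spectrum is first averaged over long stretches of spectrum.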

The solid lines in Figure 4 are the P(k) and $\Delta^2_F(k)$ measurements from the fiducial sample [sample (1) of Section 2.1], on which we will base most of our analysis. We will assign error bars to these measurements that are derived from the fractional error in the mean of the measurements from the 10 subsamples of the full data set described above. We base our error estimate on the variance among subsamples of the full data set rather than the smaller, z-limited, fiducial data set for two reasons: the inversion from 1D to 3D clustering is more manageable with the larger subsamples, and the errors based on a data set with a larger range in z should be conservative, as there will be extra variance introduced by the larger z evolution. We therefore assign fractional errors from the full sample to other samples, allowing for the difference in the number of independent data elements in each sample by scaling the fractional errors by the ratio of the square roots of the sample lengths.
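
The error assignment described above amounts to two small operations, sketched here under the assumption that the subsample measurements are stored as an array with one row per subsample. The function names and the example numbers are hypothetical.

```python
import numpy as np

def fractional_error_on_mean(subsample_values):
    """Fractional 1-sigma error on the mean, per k bin.
    subsample_values has shape (n_subsamples, n_k)."""
    vals = np.asarray(subsample_values, dtype=float)
    n = vals.shape[0]
    mean = vals.mean(axis=0)
    err_on_mean = vals.std(axis=0, ddof=1) / np.sqrt(n)
    return err_on_mean / np.abs(mean)

def scale_fractional_error(frac_err_full, length_full, length_other):
    """Rescale a fractional error from the full sample to another sample,
    using the ratio of the square roots of the sample lengths."""
    return frac_err_full * np.sqrt(length_full / length_other)

# Two toy subsamples, one k bin: mean 2, error on the mean 1.
frac = fractional_error_on_mean([[1.0], [3.0]])

# A sample one quarter as long has fractional errors twice as large.
scaled = scale_fractional_error(0.1, length_full=100.0, length_other=25.0)
```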

We can see from Figure 4 that there is significant variation between the results for the different subsamples. Each subsample corresponds to roughly the length of one full Lyα to Lyβ region in a spectrum. The errors increase towards large scales because we are averaging over fewer independent modes. On small scales, we see a turnover, due to the finite observational resolution of our data sample. The lowest resolution data that form part of our sample have a FWHM resolution of 2.3 Å. As discussed in Section 2, this is similar to the smallest scale for which the simulation tests of CWKH verified that the linear P(k) can be correctly recovered. Because of this limitation, we should only regard our results on scales larger than $k \sim 0.02$ (km s$^{-1}$)$^{-1}$ as being representative of the true shape of P(k).

The data preparation procedure described in Section 2.2 involves several operational parameters, the choice of which could conceivably affect our results. One of these is the length $L_{\rm fit}$ over which the continuum is fitted. In Figure 5 we test the effect of using different values of $L_{\rm fit}$. Again we plot both P(k) and $\Delta^2_F(k)$, this time for the fiducial sample. The error bars have been determined in the manner explained previously.

The smallest $L_{\rm fit}$ we try, 25 Å, is obviously too small, being similar in size to the largest wavelength on which we measure P(k). We try this value as an experiment, to see how this poor choice of $L_{\rm fit}$ will affect our measurements. By comparing panels (a) and (b) of Figure 5 we can see that the effects are different for P(k) and for $\Delta^2_F(k)$. At $k < 4 \times 10^{-3}$ (km s$^{-1}$)$^{-1}$, that is, on the largest scales, the continuum fitting has completely eliminated power in P(k), but $\Delta^2_F(k)$ has increased. This difference may reflect the fact that the Gaussianized field used to measure P(k) has more prominent low density regions, as the Gaussianization stretches out the PDF of low densities into a Gaussian tail. These low density regions, being closer to the continuum, may be more influenced by fitting. When we use more reasonable values for $L_{\rm fit}$, including our fiducial value of 50 Å, we can see that the two largest scale points are affected by the choice of fitting length, and for $k < 2 \times 10^{-3}$ (km s$^{-1}$)$^{-1}$ the systematic variation is outside the statistical errors. We will therefore discard the largest scale point when making use of our results. It is interesting that increasing $L_{\rm fit}$ appears to yield less power, although it is not certain whether this represents a trend or merely a statistical fluctuation.

Fig. 4. (a) P(k) for the Gaussianized flux. The solid line is measured from the spectral regions between z = 2 and z = 3 (the fiducial sample). The dots were made by dividing the full sample into ten separate pieces and calculating P(k) individually for each one. Open circles are plotted at the modulus of negative data points. (b) The 3D flux power spectrum, $\Delta^2_F(k) = k^3 P_F(k)$, for the same samples plotted in (a).

One might worry that with our relatively low spectral resolution we will fit the continuum systematically low everywhere. One way of checking to see if this is a problem is to change the clipping level below which points are discarded during the fitting process. The usual value is $2\sigma$ (where $\sigma$ is the error on the flux at a point), but if we change it to $1\sigma$ we tend to fit the continuum much higher, almost certainly too high. The mean effective optical depth of the sample, $\tau_{\rm eff}$, increases by 25% to 0.30, but this change has no direct impact because we use the PRS measurement of $\tau_{\rm eff}$ to fix the value of A, not the value from the sample itself. What is important is that P(k) and $\Delta^2_F(k)$ hardly change at all, as shown by the open squares in Figure 5. We have also tried raising the continuum uniformly everywhere by 10%, and we again find that this has a negligible effect on our results.

Fig. 5. Tests of the effect of changing the length over which the continuum is fitted, $L_{\rm fit}$. See Section 2.2 for details of the fitting procedure. (a) The Gaussianized P(k). (b) The 3D flux power spectrum, $\Delta^2_F(k)$. Error bars are derived from the error on the mean taken from splitting the sample into 10 subsamples.

As we will see by considering other potential factors, the continuum fitting process appears to act as a limit to the largest scales on which we can measure P(k) from the current data set. It may be that data with higher spectral resolution would allow more accurate continuum fitting and hence enable measurement of P(k) at larger scales. In future work we plan to carry out a systematic analysis of continuum fitting procedures using larger volume simulations (for which the true continuum is known). Such analysis might suggest better ways of determining the continuum, perhaps involving a totally different technique (see, e.g., PRS), and it would help us understand the interplay of spectral resolution, signal-to-noise ratio, and continuum fitting in limiting the accuracy and dynamic range of P(k) recovery. Continuum fitting also has an important impact on other statistical measurements of the Lyα forest, such as $\tau_{\rm eff}$ and the flux decrement distribution function (Rauch et al. 1997). Although a detailed investigation of these issues is beyond the scope of this paper, we can already surmise from consideration of Figure 5 that our P(k) measurement is likely to be reliable out to a wavelength $\lambda \sim 2300$ km s$^{-1}$, which for $\Omega_0 = 1$ corresponds to a comoving scale $\sim 12\,h^{-1}$ Mpc. On scales smaller than this, reasonable variations in the continuum fitting procedure have no significant influence on our derived P(k).

In Figure 6 we vary the other parameters used in the data preparation of Section 2.2 and compare with results for the fiducial parameter choices. We can see that changes such as not scaling the optical depths to the mean redshift, or increasing the QSO proximity gap to 100 Å from the fiducial value of 20 Å, cause only minor changes to P(k), within the $1\sigma$ errors. Even keeping the DLA systems as part of the analyzed portion of the spectra, obviously not a reasonable thing to do, does not change the results by much; there is a small change in P(k) and $\Delta^2_F(k)$ near $k \sim 0.05$ (km s$^{-1}$)$^{-1}$, which is probably the signature of power on the scale of the DLA systems themselves. These tests therefore increase our confidence in the robustness of the measurements of P(k) and $\Delta^2_F(k)$. For larger, future samples, different treatments of the data may yield systematic differences of results that rival the statistical errors. This does not appear to be the case here, indicating that the procedures we have adopted are adequate for our current data and objectives.

3.2. The normalization of P(k) 

As outlined previously, we use simulations to normalize 
our estimate of P(k). The P(k) we use as an input to the 
normalizing simulations is slightly different from the mea- 
sured P(k) plotted in Figures || and ^, in that we restore 
small scale power that was suppressed by the limited observational resolution. In CWKH, it was shown that missing power on small scales only has a small effect on $\Delta^2_F(k)$ at the large scales we use for normalization (see Figure 9 of CWKH). We therefore do not need to make this correction for lost small scale power very precisely. As described at the beginning of Section 3, we "unsmooth" P(k) using a Gaussian filter, so that

$$P_s(k) = P(k)\, e^{k^2 r_s^2}, \qquad (5)$$

where $r_s = 34$ km s$^{-1}$ and $P_s(k)$ is the power spectrum used in the normalizing simulations. We also extrapolate P(k) above the largest measured point using an $n = -1$ power law.
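
To give a feeling for the size of the correction in equation (5), the following sketch evaluates the unsmoothing factor at two wavenumbers. It assumes the exponent is $k^2 r_s^2$ with no additional numerical factor; the function name is hypothetical.

```python
import numpy as np

R_S = 34.0  # smoothing scale r_s in km/s, as quoted in the text

def unsmooth(k, p_measured, r_s=R_S):
    """Restore power suppressed by Gaussian smoothing:
    P_s(k) = P(k) * exp(k^2 r_s^2), with k in (km/s)^-1."""
    return p_measured * np.exp((k * r_s) ** 2)

# Correction factors for unit input power.
factor_small_scale = unsmooth(0.02, 1.0)    # smallest scale used
factor_large_scale = unsmooth(0.0027, 1.0)  # largest reliable scale
```

Under this assumption the correction is less than 1% at $k = 2.7 \times 10^{-3}$ (km s$^{-1}$)$^{-1}$ and grows to roughly 60% at $k = 0.02$ (km s$^{-1}$)$^{-1}$, which is consistent with the statement that it matters mainly on small scales.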

The PM simulations have $128^3$ particles on a $256^3$ grid in a periodic box 4170 km s$^{-1}$ on a side. We assume $\Omega_0 = 1$ and $\Lambda_0 = 0$ (and $H_0 = 50$ km s$^{-1}$ Mpc$^{-1}$) when running the simulations, so that the box size is 22.22 $h^{-1}$ Mpc comoving. CWKH have shown that the choice of cosmological parameters has a negligible effect on the results provided that one works in the observed km s$^{-1}$ units. We choose $\Omega_0 = 1$ for the simplifying reason that we can use different outputs of a single simulation to represent different mass fluctuation amplitudes, since $\Omega = 1$ at all redshifts and the linear theory fluctuation amplitude is proportional to the expansion factor, a(t). The initial density field is set up using $P_s(k)$ and Gaussian random phases. We average results from 4 different realizations that use different random seeds. The simulations are run so that the expansion factor, a, increases by a factor of 16.8 from the initial conditions to the most evolved output, in 84 equal steps of $\Delta a = 0.2$.
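
The conversion between velocity units and comoving lengths used above can be checked with a few lines. This sketch assumes a matter-dominated model with zero cosmological constant, for which $H(z) = H_0 (1+z)\sqrt{1+\Omega_0 z}$, and evaluates it at z = 2.5; the function names are hypothetical.

```python
import numpy as np

def hubble_over_h0(z, omega0):
    """H(z)/H0 for a matter-dominated model with Lambda = 0."""
    return (1.0 + z) * np.sqrt(1.0 + omega0 * z)

def velocity_to_comoving(v_kms, z, omega0=1.0):
    """Convert a velocity interval (km/s) at redshift z into a
    comoving length in h^-1 Mpc, using H0 = 100 h km/s/Mpc."""
    return v_kms * (1.0 + z) / (100.0 * hubble_over_h0(z, omega0))

box_size = velocity_to_comoving(4170.0, z=2.5)       # simulation box
largest_scale = velocity_to_comoving(2300.0, z=2.5)  # largest reliable scale
```

For $\Omega_0 = 1$ this gives a box of about 22.3 $h^{-1}$ Mpc and a largest reliable scale of about 12.3 $h^{-1}$ Mpc, within a fraction of a percent of the quoted 22.22 and $\sim 12\,h^{-1}$ Mpc; the small difference in the box size presumably reflects rounding of the velocity width.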

We extract spectra from the simulations, for several different output times, using the methods described in CWKH. We use a temperature-density relation of the form given by equation (1), with $T_0 = 5600$ K and $\alpha = 0.6$. We adjust the mean effective optical depth $\tau_{\rm eff}$ (by varying $\Omega_b^2/\Gamma$) so that $\tau_{\rm eff} = 0.28$, the PRS value at z = 2.5. We extract 2000 spectra in total at each output time and calculate $\Delta^2_F(k)$ from them. The results are plotted in Figure 7, where the curves for different output times are labeled with the expansion factor a, and a = 1 has been chosen to correspond to the normalization appropriate for the observational results (which we shall describe below). From Figure 7, we can see that changing the underlying amplitude of mass fluctuations results in a clear change in $\Delta^2_F(k)$ (recall that $\tau_{\rm eff}$ is the same for all spectra). We will restrict our quantitative use of the data to large scales, with $k < 0.02$ (km s$^{-1}$)$^{-1}$, which in practice means using the 2nd through the 8th observational points (we discard the first point because of continuum fitting uncertainties). On smaller scales we do not know the initial P(k) accurately, and the predicted $\Delta^2_F(k)$ depends on physical assumptions and the resolution of the simulations.

Fig. 6. Tests of the data preparation procedure. The points labeled "fiducial" are for z = 2 to z = 3 and have the continuum fitted over a 50 Å region, have $(1+z)^{4.5}$ scaling of $\tau$, a 20 Å proximity gap, and a 100 Å region excluded around DLA systems. The other points show the effect of varying these parameters. (a) The Gaussianized P(k). (b) The 3D flux power spectrum, $\Delta^2_F(k)$.

To determine the normalization of P(k), we must decide which of the $\Delta^2_F(k)$ curves in Figure 7 (or interpolation between these curves) is closest to the observational results. One possible method for determining the correct normalization involves a maximum likelihood fit of the simulation results to the observational data, where we seek to maximize the likelihood by minimizing

$$\chi^2 = \sum_{i,j} \left[\Delta^2_F(k_i) - \Delta^2_{F,{\rm sim}}(k_i, a)\right] C^{-1}_{ij} \left[\Delta^2_F(k_j) - \Delta^2_{F,{\rm sim}}(k_j, a)\right]. \qquad (6)$$

Here $\Delta^2_F(k_i)$ is the observed value of $\Delta^2_F(k)$ at $k = k_i$ and $\Delta^2_{F,{\rm sim}}(k_i, a)$ is the equivalent quantity measured from simulation outputs at expansion factor a. The covariance matrix of the data points, C, is estimated using the observational data split into subsamples. If we use this procedure, we obtain a normalization of P(k) with errors of (+10%, −9.5%). We have also applied this procedure using a jackknife estimator to determine C from the ten subsamples, with very similar results.
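
The statistic of equation (6) can be written compactly as a quadratic form. The sketch below is illustrative only: it estimates the covariance of the mean directly from the scatter among subsamples (the simple estimator, not the jackknife variant), and the function names are hypothetical.

```python
import numpy as np

def covariance_of_mean(subsample_values):
    """Covariance matrix of the mean over subsamples.
    subsample_values has shape (n_subsamples, n_k)."""
    vals = np.asarray(subsample_values, dtype=float)
    n = vals.shape[0]
    return np.cov(vals, rowvar=False) / n

def chi_squared(observed, simulated, covariance):
    """chi^2 = r^T C^-1 r, with r = observed - simulated.
    A linear solve avoids forming the explicit inverse."""
    r = np.asarray(observed, dtype=float) - np.asarray(simulated, dtype=float)
    return float(r @ np.linalg.solve(covariance, r))

# With a unit covariance matrix, chi^2 reduces to a sum of squares.
chi2_example = chi_squared([1.0, 2.0], [0.0, 0.0], np.eye(2))
```

With only ten subsamples and seven or more data points, a covariance matrix estimated this way is noisy, which is one practical reason to be cautious about the formal errors from such a fit.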

Despite its statistical logic, we have decided not to adopt the maximum likelihood determination of the normalization and error but instead to rely on a simpler estimator. There are two main reasons for this. First, we find that the above procedure yields an unrealistically low value of $\chi^2$ for the best fitting output. The low $\chi^2$ arises because the P(k) used in the simulations is measured from the observational data themselves, so that the shape of $\Delta^2_F(k)$ in the simulations is more correlated with the observations than the cosmic variance error bars suggest. Second, and probably more important, the points on small scales, with small statistical errors, are weighted most highly in the maximum likelihood fit. These points are also those most likely to be affected by systematic errors in the simulations resulting from resolution effects (see below) or uncertainty in the input physical assumptions (see CWKH). We therefore adopt an estimator that depends more evenly on the data points and that condenses the information about the amplitude of P(k) into one number,

$$S = \sum_i \Delta^2_F(k_i). \qquad (7)$$

The sum is over the 2nd to the 8th data points, as explained above. Because the variance of the flux is $\int_0^\infty \Delta^2_F(k)\, d\ln k / 2\pi^2$ and our data points are evenly spaced in $\ln k$, the quantity S is simply the contribution to the flux variance from the wavenumber range over which we estimate the power spectrum. Although this estimator does not weight the data optimally in a strictly statistical sense, it is less sensitive to systematic errors on small scales, and by using it we arrive at a conservative estimate of the P(k) normalization error.

The observed value of S is our diagnostic for the amplitude of P(k): the higher the amplitude, the larger the value of S predicted by the normalizing simulations. We choose the best fit amplitude to be the one for which the predicted S matches the observed value (using linear interpolation between the two closest simulation outputs). We obtain the $1\sigma$ uncertainty by measuring S separately for each of the 10 subsamples of the full data set [sample (2) in Section 2.2], converting the $1\sigma$ error on the mean of S into a corresponding uncertainty in the mass fluctuation amplitude. Since the relationship between S and the amplitude is fairly linear, we would get similar results if we instead determined the amplitude separately for each subsample and took the $1\sigma$ error on the mean amplitude. Normalization errors for subsets of the data, such as the fiducial sample and the other samples of Section 2.2, come from scaling the errors on S by the ratio of the square roots of the lengths of the spectra involved, as was done with the errors on the individual P(k) points.
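
The normalization step based on S can be sketched as follows. The sketch assumes that S is the plain sum of the flux power over the 2nd through 8th points and that the simulated S increases monotonically with amplitude; the function names and the toy numbers are hypothetical.

```python
import numpy as np

def s_statistic(delta2_f):
    """Sum of Delta^2_F over the 2nd through 8th data points
    (zero-based indices 1 to 7)."""
    return float(np.sum(np.asarray(delta2_f, dtype=float)[1:8]))

def amplitude_from_s(s_observed, sim_amplitudes, sim_s):
    """Amplitude at which the simulated S matches the observed S,
    by linear interpolation between simulation outputs.
    sim_s must be in increasing order."""
    return float(np.interp(s_observed, sim_s, sim_amplitudes))

# Toy check: ten points 0..9, so the 2nd to 8th points sum to 1+2+...+7.
s_toy = s_statistic(np.arange(10))

# Toy simulation grid in which S grows linearly with amplitude.
best_amplitude = amplitude_from_s(1.5, sim_amplitudes=[1.0, 2.0], sim_s=[1.0, 2.0])
```

The same interpolation applied to S plus and minus its $1\sigma$ error on the mean converts the scatter among subsamples into an uncertainty on the amplitude.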

After applying this procedure, we find the $1\sigma$ uncertainty on the normalization of the fiducial (z = 2 to 3) sample to be (+17.0%, −16.5%) in the fluctuation amplitude $\sigma$. The normalization itself is 15% higher in $\sigma$ than that which results from applying the maximum likelihood fit of equation (6).

An important additional source of error is uncertainty in the value of $\tau_{\rm eff}$. The value we use is given by the PRS formula $\tau_{\rm eff} = 0.0037(1+z)^{3.46}$. PRS measured their result from a sample of 29 low resolution QSO spectra. They estimated the continuum in the Lyα forest region by extrapolating the continuum observed on the red side of the Lyα emission line. The results are consistent with those measured from high resolution Keck spectra (Rauch et al. 1997) using a polynomial continuum fitting technique blueward of Lyα. The smaller Keck sample has larger statistical errors, but its consistency with the PRS result makes us reasonably confident that the value of $\tau_{\rm eff}$ we use is close to the true one. However, we note that discrepant, lower results for $\tau_{\rm eff}$ have been published by other authors (e.g., Zuo & Lu 1993; Dobrzycki & Bechtold 1996), and the issue is not settled.
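
As a quick check of the numbers used here, the PRS formula can be evaluated directly. The function names below are hypothetical; the coefficients are those quoted in the text.

```python
import numpy as np

def tau_eff_prs(z):
    """PRS effective optical depth: tau_eff = 0.0037 (1 + z)^3.46."""
    return 0.0037 * (1.0 + z) ** 3.46

def mean_flux(tau_eff):
    """Mean transmitted flux corresponding to tau_eff = -ln<F>."""
    return np.exp(-tau_eff)

tau_25 = tau_eff_prs(2.5)
flux_25 = mean_flux(tau_25)
```

At z = 2.5 the formula gives $\tau_{\rm eff} \approx 0.28$, the value adopted for the normalizing simulations, corresponding to a mean transmitted flux of about 0.75.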



We quantify the influence of $\tau_{\rm eff}$ on the P(k) amplitude by making new spectra from our normalizing simulations, with different values of $\tau_{\rm eff}$. We then carry out our normalizing procedure using the weighted sum of equation (7) to find the best fitting value of $\sigma$ for the observations using the new spectra. The results are shown in Figure 8. Increasing $\tau_{\rm eff}$ for a given amplitude of mass fluctuations increases the fluctuations in $\Delta^2_F(k)$, since it requires us to choose a larger value of A in equation (2). As a result, we find a lower value for $\sigma$. The $1\sigma$ error that PRS give on their $\tau_{\rm eff}$ measurement corresponds to 4% at z = 2.5. This can be translated directly to an error in $\sigma$ of (+12%, −9%), as shown in Figure 8.

In order to combine these two contributions to the normalization uncertainty, we first calculate the likelihood distribution for the amplitude of mass fluctuations for each of the sources of error taken individually, assuming that the errors on $\tau_{\rm eff}$ and on the weighted sum S of equation (7) are each Gaussian distributed. We then convolve the likelihood distributions and find the total uncertainty, which is (+20%, −17%) in the amplitude of mass fluctuations and (+45%, −31%) in P(k). The combination of errors is described in more detail in Section 5.1.
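
The relation between the two quoted uncertainties follows from $P(k) \propto \sigma^2$, as the short sketch below illustrates. It also includes a simple quadrature sum as a rough cross-check of the combined amplitude error; the quadrature sum is a stand-in chosen for illustration and is not the likelihood convolution described above. The function names are hypothetical.

```python
import numpy as np

def amplitude_to_power_error(frac_up, frac_down):
    """Convert fractional errors on the amplitude sigma into
    fractional errors on P(k), using P(k) proportional to sigma^2."""
    return (1.0 + frac_up) ** 2 - 1.0, (1.0 - frac_down) ** 2 - 1.0

def quadrature_sum(*frac_errors):
    """Rough combination of independent fractional errors
    (an approximation, not the convolution used in the text)."""
    return float(np.sqrt(np.sum(np.square(frac_errors))))

power_up, power_down = amplitude_to_power_error(0.20, 0.17)

rough_up = quadrature_sum(0.17, 0.12)     # statistical and tau_eff, upper
rough_down = quadrature_sum(0.165, 0.09)  # statistical and tau_eff, lower
```

Squaring the amplitude limits gives about +44% and −31% in P(k), close to the quoted (+45%, −31%); the small difference on the upper side presumably arises from rounding of the amplitude errors. The quadrature sum gives roughly 21% and 19%, similar to but not identical with the convolved result.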

The $\tau_{\rm eff}$ error is smaller than the main source of error, but it is nonetheless important. It would be worth investigating the measurement of $\tau_{\rm eff}$ in detail, as a more accurate measurement is critical to obtaining more accurate determinations of P(k) using larger samples of QSO data. As $\tau_{\rm eff}$ in our approach sets the value of the parameter A in equation (2), it determines how well we can measure the level of "bias" between $\tau$ and the mass fluctuations. Measurements of $\tau_{\rm eff}$ are also crucial for constraining the parameters that are subsumed into A, such as $\Omega_b$ (see Rauch et al. 1997; Weinberg et al. 1997b).

Although we do not use $\Delta^2_F(k)$ information on small scales in our normalization of P(k), we might also expect the resolution of the simulations to have some effect on $\Delta^2_F(k)$ on large scales. For example, if the normalizing simulations are of insufficient resolution, small scale fluctuations that should be near saturation, or at least away from the linear part of the curve of growth, might instead be smoothed out, and therefore contribute more to $\tau_{\rm eff}$. The interplay between $\tau_{\rm eff}$ and the amplitude of mass fluctuations described above would then lead to a systematic offset in the normalization. To check that our normalizing simulations have sufficient resolution, we have run some simulations with higher resolution and some with lower resolution. Figure 9a shows results for the low resolution runs, which use the same phases and box size as the original simulations, but have only $64^3$ particles instead of $128^3$. The mean interparticle separation, listed on the plot legend, is therefore a factor of two larger. For two of the plotted output times, this lowering of resolution has increased $\Delta^2_F(k)$ systematically on large scales. Normalizing P(k) using these low resolution simulations would result in a mass fluctuation amplitude ~20% lower. The effect on $\Delta^2_F(k)$ at smaller scales, $k > 0.015$ (km s$^{-1}$)$^{-1}$, is much stronger, since this is the regime where the lowered resolution comes directly into play, but these scales do not enter into our normalization procedure.

Figure 9b compares results at our standard resolution to results at higher resolution. Here we have only run one realization for each of the two resolutions, with identical phases, in a box of side length 11.11 $h^{-1}$ Mpc. As the phases are different from the panel (a) runs, and the cosmic variance errors are large (the volume of space simulated is 1/32 of that in panel (a)), we cannot compare panels (a) and (b) directly. We can compare the dotted curves, which have the same resolution as the original simulations, to the solid curves, which show the effect of increasing the spatial resolution by a factor of two. This time there is no systematic offset between the two, so it appears that our original simulations have sufficient resolution. The standard normalizing simulations have the same mass resolution as the SPH simulations analyzed in CWKH (but lower gravitational force resolution), and each has eight times the volume. There is a systematic difference between the standard and high resolution simulations of Figure 9b at high k, suggesting that the depression of the small scale P(k) found in the CWKH tests is caused at least in part by the finite mass resolution of the SPH simulations.

Fig. 8. The effect of varying $\tau_{\rm eff}$ in the normalizing simulations. The dashed lines show the $1\sigma$ errors on the value of $\tau_{\rm eff}$ from PRS and how they correspond to an uncertainty in $\sigma$. The errors are ~ ±4% on $\tau_{\rm eff}$ and (+12%, −9%) on $\sigma$.

Fig. 9. The effect of varying resolution in the normalizing simulations. We plot the power spectrum of the flux for a few different output times. (a) 4170 km s$^{-1}$ (22.22 $h^{-1}$ Mpc for $\Omega_0 = 1$) box simulations, with the same phases, but with different mean interparticle separations $d_p$. The results shown in this panel are an average of 4 realizations for each resolution. (b) Results from two single simulations (same phases) in a 2085 km s$^{-1}$ (11.11 $h^{-1}$ Mpc for $\Omega_0 = 1$) box with different interparticle separations.

4. THE EFFECT OF FLUCTUATIONS IN THE IONIZING 
BACKGROUND 

Before examining and discussing our P(k) results in more detail, we investigate another potential source of systematic error, clustering in the flux caused by fluctuations in the ionizing background. If the ionizing background is not uniform, as we have assumed, but instead exhibits substantial inhomogeneities, then fluctuations in $\tau$ will be caused by fluctuations in the spatially varying value of $\Gamma$ in equation (2), as well as by fluctuations in the mass density. The UV background (UVBG) is produced mainly by discrete sources, such as QSOs and starburst galaxies. Whether this discreteness has an important effect on our P(k) determination depends on the scale and amplitude of the clustering induced by the non-uniformity of the UVBG compared to that produced by intrinsic clustering in the mass. By claiming that we are able to measure P(k) for the mass, we are effectively assuming that the UVBG is uniform on the scales $< 10\,h^{-1}$ Mpc that we can access with our current observational data.

Previous work on this issue has examined the effect of UVBG fluctuations on randomly distributed Lyα clouds (Zuo 1992; Fardal & Shull 1993). In this paper we simulate the fluctuations caused in uniformly distributed gas, using the FGPA, and also the effect of modulating observed QSO spectra with additional UVBG fluctuations derived from simulations. The case for which we expect there to be the largest fluctuations is a UVBG entirely generated by QSOs, which have a very low space density. We will therefore deal with this case first and in most detail.

Our UVBG simulations are set up in a universe with $H_0 = 50$ km s$^{-1}$ Mpc$^{-1}$, $\Omega_0 = 0.2$, and $\Lambda = 0$, and a box of size 370 proper Mpc at z = 2.5 (which corresponds to 79310 km s$^{-1}$). We populate this box with QSOs, using luminosities drawn from the luminosity function of Haardt & Madau (1996), with a lower cutoff at $M_B = -23$. We try simulating both Poisson distributed and clustered QSO distributions. The clustered QSO positions are chosen by first generating a Gaussian linear density field in the box using a power spectrum appropriate for an $H_0 = 50$ km s$^{-1}$ Mpc$^{-1}$, $\Omega_0 = 0.2$ CDM model (taken from Efstathiou, Bond, & White 1992). We select all regions in this field that have a density above a certain threshold and populate them randomly with QSOs. As shown by Kaiser (1984), thresholding produces a distribution of QSOs that are clustered more strongly than the underlying mass. We choose the threshold height so that the scale at which the autocorrelation function of the QSOs is unity is $r_0 = 10\,h^{-1}$ Mpc (comoving).

The factor that most strongly influences the level of UVBG fluctuations is the attenuation length of the QSO flux. At high redshifts, the intergalactic medium has a substantial optical depth to ionizing photons. Fardal & Shull (1993) recommend parameterizing the attenuation of ionizing radiation by intergalactic gas with an attenuation length, $r_{\rm att}$, defined so that radiation reaching a distance r from a source is attenuated on average by a factor $e^{-r/r_{\rm att}}$. Fardal & Shull (1993) and Haardt & Madau (1996) estimate that $r_{\rm att} \approx 100$ proper Mpc (for h = 0.5) at z = 2.5. The attenuation length falls rapidly with increasing redshift, as the Universe becomes more optically thick. We will try using both $r_{\rm att} = 100$ Mpc and $r_{\rm att} = 50$ Mpc in our simulations.

To generate spectra from our UVBG simulations, we randomly select lines of sight through the box and calculate the intensity of UV radiation at each point along them, summing the contributions of all the QSOs in the box. We assume Euclidean space, which should be a good approximation at high redshift, and periodic boundary conditions. We therefore apply an inverse-square law to the radiation, which is additionally attenuated according to the attenuation law described above. Because of the finite box size, we cut off the flux after it has traveled one full box side length. The optical depth at each point in the spectra is calculated according to equation (2), with $\rho_b(x) = 1$ (the cosmic mean) and $\Gamma(x) \propto J(x)$, where J(x) is the UV radiation intensity at point x. The value of A is set so that $\tau_{\rm eff}$ for the spectra is equal to the PRS value of 0.28.
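
The calculation of the UV intensity along a line of sight can be sketched as below. This is a simplified illustration rather than the scheme used for the simulations: it keeps only the nearest periodic image of each source instead of applying a cutoff at one box length, and it adds a small softening length to avoid a divergence at zero separation. The function names, the softening, and the test geometry are assumptions.

```python
import numpy as np

def uv_intensity(points, sources, luminosities, box, r_att):
    """UV intensity at each point from discrete sources in a periodic box.
    Each source contributes L / (4 pi r^2) * exp(-r / r_att).
    Simplification: only the nearest periodic image of each source is used."""
    points = np.asarray(points, dtype=float)
    sources = np.asarray(sources, dtype=float)
    sep = points[:, None, :] - sources[None, :, :]
    sep -= box * np.round(sep / box)        # nearest periodic image
    r = np.sqrt(np.sum(sep**2, axis=-1))
    r = np.maximum(r, 1e-3 * box)           # softening (assumption)
    flux = np.asarray(luminosities) / (4.0 * np.pi * r**2) * np.exp(-r / r_att)
    return flux.sum(axis=1)

def relative_optical_depth(j):
    """tau(x) / A for uniform-density gas, for which tau scales as 1 / J."""
    return np.mean(j) / j

# One source at the origin of a 370 Mpc box, attenuation length 100 Mpc.
# The two points are both 10 Mpc from the source once periodicity is applied.
j = uv_intensity([[10.0, 0.0, 0.0], [360.0, 0.0, 0.0]],
                 [[0.0, 0.0, 0.0]], [1.0], box=370.0, r_att=100.0)
tau_rel = relative_optical_depth(j)
```

The overall amplitude A would then be fixed by requiring $\tau_{\rm eff} = 0.28$ for the resulting spectra, in the same way as for the normalizing simulations.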

In Figure 10, we show portions of five sample UVBG simulation spectra (the simulation box is more than twice the length of the spectra shown), together with a piece of the spectrum of Q2206-199. The model spectra represent a universe in which the IGM has uniform density and absorption fluctuations are caused only by inhomogeneities of the UVBG. The fluctuations are mild and have a large coherence scale, very different from the observed spectrum shown in the bottom panel. The bar shown in the top panel is of length 3000 km s$^{-1}$, corresponding to the largest wavelength for which we have tried to measure P(k). The fluctuations caused by UVBG inhomogeneity are small compared to the observed flux variations on this scale (a point that we will demonstrate quantitatively below). It was shown in Section 3.1 that this scale is already larger than the minimum scale that is affected by continuum fitting. Judging from Figure 10, UVBG fluctuations will therefore not limit our ability to measure P(k) on these scales and below. We note that the assumptions employed in Figure 10 (strongly clustered QSOs and an attenuation length half the expected value) are those that tend to maximize the fluctuations.

We can examine the effect of UVBG fluctuations quantitatively by measuring $\Delta^2_F(k)$ for the UVBG simulation spectra. The results are plotted in Figure 11, for Poisson distributed QSOs and clustered QSOs, and for the two different values of $r_{\rm att}$. Reducing the attenuation length by a factor of two has a significant effect, raising $\Delta^2_F(k)$ by roughly a factor of two, while the change induced by clustering the QSOs is barely measurable. The value of $\Delta^2_F(k)$ in the UVBG simulations is 5% or less of $\Delta^2_F(k)$ in the observations for the largest scale plotted, and ~1% for the largest scale reliable enough to use in our analyses, $k = 2.7 \times 10^{-3}$ (km s$^{-1}$)$^{-1}$. On very large scales ($\lambda \sim 100$ to $200\,h^{-1}$ Mpc for $\Omega_0 = 1$), the UVBG fluctuations should become important, but these scales are beyond the range of the techniques we are using here. The results of our analysis agree with the conclusions reached by Fardal & Shull (1993), that the UVBG fluctuations have a relatively small amplitude and a large coherence scale. We should bear in mind that at higher redshifts, if QSOs are the dominant source of UVBG radiation, then the fluctuations should increase, perhaps to a detectable level, because of the observed decrease in the space density of QSOs past z = 3 (Warren et al. 1994) and because of the decreasing transparency of the IGM.

Fig. 10. - Top five panels: Lines of sight through a universe with
$h = 0.5$, $\Omega_0 = 0.2$, $\Lambda = 0$, in which a uniform IGM is
photoionized by a discrete population of QSOs drawn from the Haardt &
Madau (1996) luminosity function. The lines represent the transmitted
flux assuming the PRS value of $\tau_{\rm eff}$, clustered QSOs, and a
50 Mpc attenuation length, $r_{\rm att}$. The full simulation occupies
a 370 proper Mpc box, which is 79310 km s$^{-1}$ on a side at z = 2.5.
The bottom panel displays a portion of the spectrum of Q2206-199. The
bar in the top panel is of length 3000 km s$^{-1}$, the largest
wavelength plotted in the previous P(k) figures.

We can look at the effects of inhomogeneity in the UVBG in a
different way by modulating observed QSO spectra with additional
fluctuations derived from our UVBG simulations. This should give us
an idea of what occurs when density fluctuations and UVBG
fluctuations are taken together. We have done this by taking values
of $J$ along lines of sight in the UVBG simulations and multiplying
$\tau$ in the observations by $\bar{J}/J(x)$, where $\bar{J}$ is the
mean radiation intensity. The results are plotted in Figure 12, with
error bars representing the variance of the results for 10 separate
realizations of the UVBG. There are only very small changes with
respect to the unmodulated results. The variance between realizations
is apparent, even on small scales. The effects of UVBG fluctuations
on small scales are probably due to their manifestation as changes in
the mean optical depth, which slightly increase the variance in the
measured P(k) from spectrum to spectrum.
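The modulation step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `modulate_flux`, the clipping floor, and the pixel-by-pixel treatment are ours, not taken from the paper, which does not give implementation details. The flux is converted to optical depth, the optical depth is multiplied by $\bar{J}/J(x)$, and the result is converted back to flux.

```python
import math

def modulate_flux(flux, j_ratio, floor=1e-6):
    """Rescale the optical depth of each pixel by Jbar/J(x).

    flux    : continuum-normalized transmitted flux per pixel
    j_ratio : Jbar/J(x) per pixel (hypothetical input from a UVBG model)
    floor   : assumed lower clip so saturated pixels have a finite tau
    """
    modulated = []
    for f, ratio in zip(flux, j_ratio):
        # Clip to (floor, 1] so that tau = -ln(F) is finite and non-negative.
        tau = -math.log(min(max(f, floor), 1.0))
        modulated.append(math.exp(-tau * ratio))
    return modulated

# A pixel with F = 0.5 seen where J is half the mean (ratio = 2) has its
# optical depth doubled; a pixel at the mean intensity is unchanged.
print(modulate_flux([0.5, 0.8], [2.0, 1.0]))
```

Because $\tau$ scales inversely with the local intensity, regions with weaker than average radiation absorb more, which is why the ratio enters as $\bar{J}/J(x)$ rather than its inverse.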

Our UVBG simulations could be extended to include other potential
characteristics of the QSO population. For example, it is possible
that QSOs emit their Ly$\alpha$ radiation in a highly beamed way or
else that they have short lifetimes compared to the light travel time
across $r_{\rm att}$. If either of these effects were operating, the
effective space density of QSO sources responsible for the UVBG
should be larger than that given by the luminosity function we have
used. This increased space density would counterbalance any extra
inhomogeneity in the radiation emitted by the sources themselves.

Fig. 12. - The 3D power spectrum of the observed QSO spectra
modulated by the inhomogeneous UV background taken from the UVBG
simulations used in Figure 10 (clustered QSOs, $r_{\rm att} = 50$
Mpc). The error bars represent the variance measured from modulating
the observations ten times using different realizations of the
simulated UVBG. The lines represent our fiducial result (see previous
figures).

If the sources of radiation are more numerous than QSOs, we expect
the UVBG to be more homogeneous. We can make a rough estimate of the
size of fluctuations using the model of Fardal & Shull (1993, see
also Kovner & Rees 1989). At a certain distance from a source, $r_p$,
the UV intensity produced by the source is equal to that produced by
the UVBG, so that $J_s(r_p) = \bar{J}$. We define the effect of this
local UV radiation to be strong if $r < r_p$. If all sources have the
same luminosity, and a space density $n$, then
$r_p = (4\pi n r_{\rm att})^{-1/2}$. The volume filling factor, $f$,
of regions where the effect of local UV radiation is strong is
$4\pi r_p^3 n/3$. For small $f$, the variance of UVBG fluctuations is
approximately equal to $f$. If the sources of UVBG radiation are
starburst galaxies with mean separations of 5 $h^{-1}$ Mpc comoving,
then, using Fardal & Shull's value for $r_{\rm att}$ at z = 2.5, we
find $f \sim 10^{-3}$ (if $\Omega_0 = 1$). This is extremely small,
showing that the contribution to the UVBG of galaxies and other
sources with the same or greater number density will be very smooth.
This smoothness will be enhanced by re-emission of UVBG radiation
from Ly$\alpha$ forest and Lyman limit systems, which are even more
numerous. This smoothing effect was pointed out by Haardt & Madau
(1996), who estimated that recombination radiation from these systems
contributes about 30% of the UVBG at z = 3.
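The filling-factor estimate above is simple enough to evaluate directly. The sketch below uses the two relations quoted in the text; the attenuation length of 100 $h^{-1}$ Mpc (comoving) is a hypothetical placeholder chosen for illustration, not Fardal & Shull's value, and the function names are ours. Both $n$ and $r_{\rm att}$ must be expressed in the same (comoving or proper) units.

```python
import math

def proximity_radius(n, r_att):
    """r_p = (4 pi n r_att)^(-1/2): where a source matches the mean background."""
    return (4.0 * math.pi * n * r_att) ** -0.5

def filling_factor(n, r_att):
    """f = 4 pi r_p^3 n / 3: volume fraction dominated by a single source."""
    r_p = proximity_radius(n, r_att)
    return 4.0 * math.pi * r_p ** 3 * n / 3.0

# Sources with a mean separation of 5 h^-1 Mpc (comoving), as in the text.
n_sources = 1.0 / 5.0 ** 3
# ASSUMPTION: illustrative attenuation length in the same comoving units.
r_att_assumed = 100.0

r_p = proximity_radius(n_sources, r_att_assumed)
f = filling_factor(n_sources, r_att_assumed)
print(f"r_p = {r_p:.3f} h^-1 Mpc, f = {f:.2e}")
```

Substituting $r_p$ into the expression for $f$ gives $f = r_p/(3 r_{\rm att})$, so the filling factor falls as $n^{-1/2}$: more numerous sources give a smoother background, in line with the argument in the text.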

In this Section we have studied the possible effect of an
inhomogeneous UVBG, something that has not previously been
incorporated into simulations of the Ly$\alpha$ forest. Other
physical processes such as quasar outflows or supernova shock heating
of the IGM on the outskirts of galaxies could also affect the
clustering seen in Ly$\alpha$ spectra, but most of these would be
confined to high density regions with a small volume filling factor.
They would therefore have little impact on the large scale mass
clustering inferred from the Ly$\alpha$ forest, just as shock
heating, collisional ionization, and star formation, processes that
are included in SPH simulations but not in the PM approximation, have
negligible impact on the recovery of P(k) (see CWKH). A physical
effect that could have an impact in low density regions is
inhomogeneous heating of the IGM during helium reionization
(Miralda-Escudé & Rees 1994), which could produce spatial
fluctuations in the temperature-density relation if helium
reionization is sufficiently late and sufficiently patchy. However,
if we consider equation (2) in the limit of weak fluctuations, we see
that the fluctuations in temperature at fixed density would need to
be more than twice as large as the fluctuations in density on the
same spatial scale in order to have an equal effect, since
$\tau \propto \rho^{1.6} T^{-0.7}$.
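The factor of "more than twice" follows from linearizing the scaling just quoted. As a short worked version, write $\delta_\rho$ and $\delta_T$ (our labels, not the paper's) for small fractional perturbations in the density and in the temperature at fixed density:

```latex
\frac{\delta\tau}{\tau} \simeq 1.6\,\delta_\rho - 0.7\,\delta_T ,
\qquad
\left|0.7\,\delta_T\right| = \left|1.6\,\delta_\rho\right|
\;\Longrightarrow\;
\frac{|\delta_T|}{|\delta_\rho|} = \frac{1.6}{0.7} \approx 2.3 .
```

Temperature fluctuations therefore need an amplitude about 2.3 times that of the density fluctuations to contribute equally to the optical depth.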

In the long run, rather than trying to investigate all possible
sources of spurious clustering, we should look for support for the
gravitational instability interpretation of Ly$\alpha$ forest
fluctuations in the observational data themselves. Already many
aspects of the observational Ly$\alpha$ forest data can be reproduced
and explained by the scenario (see, e.g., Bi & Davidsen 1997; Rauch
et al. 1997). One specific test of our approach is the measurement of
the evolution of P(k) with redshift. The P(k) we measure should
change in the way predicted by linear theory, keeping the same shape
and increasing in amplitude in a way that (in detail) depends on
$\Omega_0$ and $\Lambda_0$. It seems very unlikely that any
non-gravitational processes could precisely mimic this behavior, so
if linear growth were seen in the data it would provide strong
evidence for the validity of the P(k) measurement. Although the
observational sample we are using in this paper is too small to carry
out this test unambiguously, we are at least able to split the sample
into two redshift halves and see if there are any gross deviations
from linear growth. We will do this below.

5. RESULTS 

5.1. Tabulation of P(k) and a power law fit

In Table 1, we give the values of P(k) and their $1\sigma$ errors for
the seven points where we believe that our measurement is
representative of its primordial value. The $1\sigma$ errors were
calculated in Section 3.1, from the scatter between results for 10
subsamples of the data. They primarily represent uncertainties in the
shape of P(k), as there is a separate normalization uncertainty that
applies to all points equally. This normalization uncertainty was
estimated in Section 3.2, again from the scatter in results between
10 subsamples of the data. The covariance matrix of the data values
(calculated using the 10 subsamples) has some non-negligible
off-diagonal terms, which quantitative evaluation of models should
take into account.

As the estimated errors on our points are fairly large, and we cover
a limited range in k, the information in our P(k) measurement can be
effectively summarized by the amplitude and slope of a power law fit
to the data points. When determining the parameters of this fit, we
can include the effect of covariances between data points, which are
not given in the Table above. To eliminate as much as possible the
covariance between the fit parameters themselves, we have chosen to
describe the amplitude of the fit by the value of P(k) at a pivot
wavenumber $k_p$ near the center of the data range. The form we fit
is therefore

$$P(k) = P_p \left( \frac{k}{k_p} \right)^{n}. \qquad (8)$$



We perform a $\chi^2$ fit to the seven data points, including the
full covariance matrix for the points [as in equation (6), though
here it is the Gaussianized flux P(k) rather than $\Delta^2_F(k)$
that enters]. We try several values for the pivot wavenumber and
choose $k_p = 0.008$ (km s$^{-1}$)$^{-1}$, the value for which the
covariance between $P_p$ and $n$ is minimized. In evaluating the
covariance matrix of the P(k) data points from the ten subsamples of
the full data set, we find that the fluctuations of neighboring data
points are usually anticorrelated, probably because of the
differentiation involved in going from the 1D power spectrum to the
3D power spectrum (equation [3]). As a consequence, the statistical
error on the power law slope $n$ is smaller than it would be if we
ignored the covariances in our $\chi^2$ evaluation. Because the
anticorrelated structure of the covariance matrix significantly
influences the error estimate and the estimate of the covariance
matrix from the data subsamples is itself noisy, we regard our
estimate of the error on $n$ as itself significantly uncertain. If we
ignored covariance terms when fitting $n$, we would get error bars
~45% larger ($n = -2.30 \pm 0.26$) than those reported below based on
using the full covariance matrix.
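To make the role of the covariance matrix concrete, the sketch below fits equation (8) by generalized least squares, once with a full covariance matrix and once with only its diagonal. Everything numerical here is hypothetical: the wavenumbers, the 30% fractional errors, and the neighbor correlation of -0.3 are placeholders, not the values of Table 1. The fit is also done in $\ln P$ with a covariance matrix for $\ln P$, a simplification we adopt so that the model is linear in its parameters; it is not claimed to be the exact procedure used in the paper.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def fit_power_law(k, pk, cov_lnp, k_pivot):
    """Fit ln P = ln P_p + n ln(k/k_p); return (P_p, n, 1-sigma error on n)."""
    x = [math.log(ki / k_pivot) for ki in k]
    y = [math.log(p) for p in pk]
    ones = [1.0] * len(k)
    ci_one = solve(cov_lnp, ones)  # C^-1 applied to the amplitude column
    ci_x = solve(cov_lnp, x)       # C^-1 applied to the slope column
    s11, s1x, sxx = dot(ones, ci_one), dot(ones, ci_x), dot(x, ci_x)
    b1, bx = dot(y, ci_one), dot(y, ci_x)
    det = s11 * sxx - s1x ** 2
    ln_pp = (sxx * b1 - s1x * bx) / det
    slope = (s11 * bx - s1x * b1) / det
    return math.exp(ln_pp), slope, math.sqrt(s11 / det)

# HYPOTHETICAL inputs: seven noiseless points on a power law.
K_PIVOT = 0.008
k_values = [0.0027, 0.0036, 0.0047, 0.0062, 0.0082, 0.0108, 0.0142]
pk_values = [2.21e7 * (k / K_PIVOT) ** -2.25 for k in k_values]
FRAC_ERR, NEIGHBOR_CORR = 0.3, -0.3
cov = [[FRAC_ERR ** 2 * (1.0 if i == j else
                         NEIGHBOR_CORR if abs(i - j) == 1 else 0.0)
        for j in range(7)] for i in range(7)]
diag = [[cov[i][j] if i == j else 0.0 for j in range(7)] for i in range(7)]

p_pivot, slope, slope_err = fit_power_law(k_values, pk_values, cov, K_PIVOT)
_, _, slope_err_diag = fit_power_law(k_values, pk_values, diag, K_PIVOT)
print(p_pivot, slope, slope_err, slope_err_diag)
```

Printing the two slope errors side by side shows how much the off-diagonal terms change the quoted uncertainty for a given covariance structure. With a real, noisy covariance matrix estimated from only ten subsamples, that difference is itself uncertain, which is the caveat raised in the text.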

Figure 13 shows the best fit power law, together with the points from
Table 1. Figure 14a shows contours of constant $\Delta\chi^2$ for the
fit, where
$\Delta\chi^2 = \chi^2(P_p, n) - \chi^2(P_{p,\rm min}, n_{\rm min})$
and $(P_{p,\rm min}, n_{\rm min})$ are the values of the fit
parameters for which the $\chi^2$ is a minimum. The 1, 2 and
$3\sigma$ contours of joint confidence in the fit parameters taken
together are shown, corresponding to $\Delta\chi^2 = 2.30$, 6.17,
11.80. The value of $\chi^2$ at the minimum is 4.0. A value greater
than this would be expected to occur 55% of the time given that we
have five degrees of freedom (seven data points minus two free
parameters). We can see that for the pivot wavenumber we have chosen,
the errors on the slope and amplitude of the power law fit are
effectively independent.
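The probabilities quoted above can be checked with closed-form expressions for the $\chi^2$ distribution. The sketch below uses only the Python standard library; the function names are ours. For two parameters the joint confidence levels follow from $\Delta\chi^2 = -2\ln(1-p)$, and for five degrees of freedom the tail probability has a simple expression in terms of the complementary error function.

```python
import math

def chi2_tail_5dof(x):
    """P(chi^2 > x) for five degrees of freedom (closed form for odd dof)."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0)
            * (1.0 + x / 3.0))

def delta_chi2_two_params(prob):
    """Delta chi^2 enclosing joint probability `prob` for two parameters."""
    return -2.0 * math.log(1.0 - prob)

# Chance of exceeding the best-fit chi^2 of 4.0 with 5 degrees of freedom.
print(f"P(chi2 > 4.0 | 5 dof) = {chi2_tail_5dof(4.0):.3f}")

# Joint 1, 2 and 3 sigma levels for two fitted parameters.
for prob in (0.6827, 0.9545, 0.9973):
    print(f"{prob:.4f} -> Delta chi2 = {delta_chi2_two_params(prob):.2f}")
```

The small differences from the rounded values in the text (for example 6.18 against 6.17) come from the number of digits kept in the confidence levels.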

The uncertainty in $P_p$ comes not from the uncertainty in fitting a
power law to the P(k) data points but from the normalization
uncertainty detailed in §3.2, which affects the level of all the data
points simultaneously. The amplitude of the mass power spectrum is
fixed by requiring that spectra from the normalizing simulations
reproduce the observed value of the amplitude diagnostic S (equation
[7]), and the uncertainty is determined from the uncertainty in S
estimated from the scatter among subsamples. There is an additional
contribution to the normalization uncertainty from the uncertainty in
$\tau_{\rm eff}$, as illustrated in Figure p|. To combine the two
sources of error, we construct $\Delta\chi^2$ distributions for each
(shown by the dotted and dashed lines in Figure 15a), assuming that
the errors on S and $\tau_{\rm eff}$ are Gaussian distributed. We
then convolve the two corresponding likelihood distributions,
$\mathcal{L}/\mathcal{L}_{\rm max} = e^{-\Delta\chi^2/2}$, where
$\mathcal{L}_{\rm max}$ is the maximum likelihood, and convert the
convolved likelihood distribution into a combined $\Delta\chi^2$
curve, shown by the thick line in Figure 15a. The intersection of
this curve with the horizontal lines at $\Delta\chi^2 = 1$, 4, 9
gives the $1\sigma$, $2\sigma$, and $3\sigma$ errors on $P_p$. The
uncertainty coming from the normalization procedure dominates the
overall uncertainty in $P_p$, but the $\tau_{\rm eff}$ uncertainty
makes a significant contribution and could easily come to dominate in
the analysis of a larger data set. We do not include the uncertainty
in the power law fit amplitude as a separate source of error because
it has already been counted in the normalization error: the
uncertainty in the overall level of the data points is the reason for
uncertainty in the amplitude diagnostic S.
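The combination of the two error sources can be illustrated numerically. In the sketch below the two fractional uncertainties (0.18 and 0.10) are hypothetical placeholders, and both likelihoods are taken to be Gaussian in the amplitude offset itself, which is a simplification: in the paper the Gaussian assumption is made for S and $\tau_{\rm eff}$, and the resulting errors on $P_p$ are asymmetric. The function names and the grid are ours.

```python
import math

def gaussian_likelihood(grid, sigma):
    """Relative likelihood L/L_max for a Gaussian error of width sigma."""
    return [math.exp(-0.5 * (x / sigma) ** 2) for x in grid]

def convolve_on_grid(a, b):
    """Convolve two curves sampled on the same odd-length grid centred on 0."""
    n = len(a)
    half = n // 2
    out = []
    for i in range(n):
        total = 0.0
        for j in range(n):
            m = i - j + half  # grid index of the offset x_i - x_j
            if 0 <= m < n:
                total += a[j] * b[m]
        out.append(total)
    return out

def delta_chi2(likelihood):
    """Convert a likelihood curve into Delta chi^2 = -2 ln(L / L_max)."""
    peak = max(likelihood)
    return [-2.0 * math.log(v / peak) for v in likelihood]

# HYPOTHETICAL 1-sigma fractional errors from the two sources.
SIGMA_NORM, SIGMA_TAU = 0.18, 0.10
STEP = 0.005
grid = [(i - 200) * STEP for i in range(401)]

combined = convolve_on_grid(gaussian_likelihood(grid, SIGMA_NORM),
                            gaussian_likelihood(grid, SIGMA_TAU))
curve = delta_chi2(combined)

# Half-width where the combined curve first reaches Delta chi^2 = 1.
one_sigma = next(x for x, d in zip(grid, curve) if x > 0 and d >= 1.0)
print(one_sigma, math.hypot(SIGMA_NORM, SIGMA_TAU))
```

For two Gaussians the convolution simply adds the variances, so the grid result agrees with the quadrature sum to within the grid spacing. The numerical route is the one that still works when the individual $\Delta\chi^2$ curves are not parabolic.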

Our final results for the power law parameters and 
their la errors are P p = 2.21^1 x 10 7 ( km s" 1 )- 3 and 
$n = -2.25 \pm 0.18$. The $\Delta\chi^2$ distribution for $n$ is
shown in Figure 15b. As discussed in §3.2, if we had used a maximum
likelihood fit to normalize P(k) instead of our more conservative
(and, we think, more robust) method based on the diagnostic S, we
would have obtained a value of $P_p$ 30% ($0.7\sigma$) higher and
statistical uncertainties in $P_p$ smaller by ~30% (after including
the $\tau_{\rm eff}$ error).

Because the errors on $P_p$ and $n$ are independent, we can combine
their one-dimensional $\Delta\chi^2$ distributions into a
two-dimensional plot by simply adding the values of $\Delta\chi^2$
(equivalent to multiplying the likelihoods). The $\Delta\chi^2$
contours corresponding to 68%, 95%, and 99.7% confidence intervals
for $\chi^2$ distributions with two degrees of freedom are shown in
Figure 14b.

Since the off-diagonal terms in the covariance matrix of the P(k)
data points are significant, and the values and uncertainties of
$P_p$ and $n$ summarize the results effectively, we recommend use of
the power law fit parameters rather than the tabulated P(k) when
evaluating models. A quantity whose physical meaning is more
intuitively obvious than $P_p$ is $\Delta^2_\rho(k_p)$, the
contribution to the variance of density fluctuations from a
logarithmic interval in k, given by

$$\Delta^2_\rho(k_p) = \frac{1}{2\pi^2}\, k_p^3 P_p \qquad (9)$$

(see, e.g., Peacock & Dodds 1994). Our results in terms of this
quantity are $\Delta^2_\rho(k_p) = 0.57^{+0.26}_{-0.18}$ ($1\sigma$
errors).
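As a quick arithmetic check, the dimensionless amplitude follows directly from the fitted $P_p$ and the pivot wavenumber. The sketch below uses the central values quoted in the text; the function name is ours.

```python
import math

def delta2(k, pk):
    """Contribution to the density variance per unit ln k: k^3 P(k) / (2 pi^2)."""
    return k ** 3 * pk / (2.0 * math.pi ** 2)

K_PIVOT = 0.008   # (km/s)^-1, pivot wavenumber from the text
P_PIVOT = 2.21e7  # (km/s)^3, central value of the fitted amplitude

print(f"Delta^2(k_p) = {delta2(K_PIVOT, P_PIVOT):.2f}")
```

Because $\Delta^2_\rho(k_p)$ is proportional to $P_p$, the two quantities share the same fractional uncertainty, and the rms fluctuation amplitude, which scales as $\sqrt{P(k)}$, has roughly half that fractional uncertainty.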

Fig. 13. - The fiducial P(k) result at z = 2.5. Points plotted are
those from Table 1. Also shown is the power law fit of equation (8),
together with dotted lines showing the $\pm 1\sigma$ uncertainty in
the slope for fixed $P(k_p)$. The error bar at lower left shows the
$1\sigma$ normalization uncertainty. At the $1\sigma$ level, all of
the data points can be shifted coherently up or down by this amount.

5.2. Redshift evolution of P(k)

We now test the effect of redshift evolution using the high- and
low-z halves of our data (see Section 2.1 for a description of the
samples). When the linear P(k) evolves with redshift, it is subject
to two main effects. First, there is the change in the amplitude of
P(k) owing to linear growth, with P(k) increasing in proportion to
the linear growth factor squared as z decreases. The growth factor is
proportional to $a(t)$ in an Einstein-de Sitter model, and in other
models its evolution depends on the values of $\Omega_0$ and
$\Lambda$ (see, e.g., Peebles 1980). In a plot of P(k) against k,
such as Figure 13, the linear growth of P(k) would affect only the
y-axis. However, because our observed units are velocities rather
than comoving distances, the evolution of the Hubble parameter,
$H(z)$, shifts P(k) along both the x- and y-axes. The z dependence of
$H(z)$ is determined by $\Omega_0$ and $\Lambda_0$ through the
Friedmann equation, which can be rearranged to yield

$$H(z) = H_0 \left[ \Omega_0 (1+z)^3 + (1 - \Omega_0 - \Lambda_0)(1+z)^2 + \Lambda_0 \right]^{1/2}. \qquad (10)$$

In decelerating universes the P(k) curve shifts to the right as z
decreases because a given scale in units of comoving $h^{-1}$ Mpc
corresponds to a smaller scale in km s$^{-1}$. Because P(k) is also
in velocity units, the change of scale also shifts the P(k) curve
downwards. These changes in units partially cancel the linear growth
of P(k), so the overall measured z-evolution of P(k) is expected to
be rather weak. We will therefore need a large observational data
sample and a long z baseline to discriminate between models with
different values of $\Omega_0$ and $\Lambda_0$. With our current data
we will restrict ourselves to the less ambitious goal of testing
whether measurements of P(k) from the two different redshift
subsamples are consistent with linear growth. The relatively small
sizes of our samples do not allow us to measure the growth rate, but
we could potentially detect the consequences of a non-gravitational
process that alters the clustering in the Ly$\alpha$ forest. For
example, if large scale inhomogeneous reheating of the intergalactic
medium occurred during the redshift interval covered by our samples
($z \sim 3.2 \rightarrow 1.6$), and this reheating was important
enough to change Ly$\alpha$ clustering, then we would not expect our
estimates of P(k) for the high-z and low-z samples to be consistent
with linear growth.

Fig. 14. - (a) Contours of constant $\Delta\chi^2$ resulting from
fitting a power law (eq. [8]) to the P(k) data for the fiducial
sample. The amplitude of P(k) at the pivot wavenumber,
$k_p = 0.008$ (km s$^{-1}$)$^{-1}$, is shown on the y-axis, and the
logarithmic slope, $n$, on the x-axis. The best fitting values are
marked by a cross, and the contours enclose 68%, 95% and 99.7% of the
joint probability. (b) Contours of 68%, 95%, and 99.7% joint
probability ($\Delta\chi^2 = 2.30$, 6.17, 11.80) after including the
overall normalization uncertainty (see text and Figure 15).

Fig. 15. - One-dimensional $\Delta\chi^2$ distributions resulting
from a power law fit to P(k) for the fiducial sample. (a) The
uncertainty in the amplitude $P(k_p)$. The dotted line shows the
uncertainty from the normalization procedure, corresponding to the
error bar in the lower left corner of Figure 13. The dashed line
shows the uncertainty from the error in $\tau_{\rm eff}$ (Figure H).
The heavy solid line shows the combined uncertainty, obtained by
convolving the two likelihood distributions (assumed to be Gaussian).
(b) The uncertainty in the power law slope $n$.

In Figure 16a we show the P(k) results for our fiducial (z = 2.5)
sample. In this Figure, and in those in the rest of this section, we
do not plot the largest scale point displayed in Figures H-fj]
because it was shown in Section 3.1 to be sensitive to continuum
fitting uncertainties. Figure 16a also shows the z = 2.5 linear P(k)
of a spatially flat CDM model with $\Omega_0 = 1$, $h = 0.5$, and
$\sigma_8 = 0.6$, where $\sigma_8$ is the amplitude of density
fluctuations in 8 $h^{-1}$ Mpc spheres linearly extrapolated to
z = 0. The model will be described in more detail in the next
section. At the moment, it serves as a reference curve to which we
can compare the results from the different z subsamples of the data.

We evaluate P(k) for the high- and low-z halves of the data,
subsamples (3) and (4) of Section 2.1, and for the secondary, fiber
field sample, also described in Section 2.1. We use the same
normalizing simulations that were used for the fiducial sample, since
the P(k) shape for the subsamples appears in Figure 16 to be
consistent with being a noisy version of the P(k) shape for the
fiducial sample. Before normalizing, we must take into account that
there might be an amplitude offset between the Gaussianized P(k)
measured from the different redshift subsamples and the Gaussianized
P(k) from the fiducial z = 2.5 sample used to set up the normalizing
simulations. We measure this amplitude offset using equation (7),
except that we replace $\Delta^2_F(k)$ with P(k). This tells us an
additional factor we must use in our normalization of P(k) from the
different redshift subsamples, which in all cases turns out to be
less than 10%. When measuring the amplitude offset, we must make sure
that we are comparing P(k) values on the same scales. This entails a
small rescaling of the length scales for the different subsamples,
where we rescale k to the value it would have in (km s$^{-1}$)$^{-1}$
at z = 2.5 using equation (10). We assume $\Omega_0 = 1$ to do this,
but our results are not sensitive to this choice. Having done this,
and having found the amplitude offset, we then carry out the
normalizations using $\Delta^2_F(k)$ and equation (7), as was done in
Section 3.2.
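The rescaling of k between redshifts relies only on equation (10). A minimal sketch follows, with function names of our own choosing and with $\Lambda_0$ defined so that a flat model has $\Omega_0 + \Lambda_0 = 1$, as in equation (10). A comoving length corresponds to a velocity interval $H(z)/(1+z)$ per unit length, so a velocity-space wavenumber changes by the ratio of that factor at the two redshifts.

```python
import math

def hubble_ratio(z, omega0, lambda0):
    """H(z)/H0 from equation (10)."""
    zp1 = 1.0 + z
    return math.sqrt(omega0 * zp1 ** 3
                     + (1.0 - omega0 - lambda0) * zp1 ** 2
                     + lambda0)

def kms_per_comoving_mpc(z, omega0, lambda0, h=1.0):
    """Velocity width in km/s of one comoving Mpc at redshift z."""
    return 100.0 * h * hubble_ratio(z, omega0, lambda0) / (1.0 + z)

def rescale_k(k_vel, z_from, z_to, omega0=1.0, lambda0=0.0):
    """Velocity-space k at z_to of the comoving scale that has k_vel at z_from."""
    return k_vel * (kms_per_comoving_mpc(z_from, omega0, lambda0)
                    / kms_per_comoving_mpc(z_to, omega0, lambda0))

# One comoving h^-1 Mpc at z = 2.5 for Omega_0 = 1 (h = 1 gives h^-1 Mpc).
print(kms_per_comoving_mpc(2.5, 1.0, 0.0))

# The 370 proper Mpc box of Fig. 10 (h = 0.5, Omega_0 = 0.2) in km/s.
print(370.0 * 50.0 * hubble_ratio(2.5, 0.2, 0.0))

# Pivot wavenumber measured at <z> = 2.1 expressed on the z = 2.5 scale.
print(rescale_k(0.008, 2.1, 2.5))
```

The first number, about 187 km s$^{-1}$ per comoving $h^{-1}$ Mpc, reproduces the conversion between the 450 to 2350 km s$^{-1}$ range and the 2 to 12 $h^{-1}$ Mpc range quoted in the abstract. The second recovers the box size in the Fig. 10 caption to within rounding.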

Results for the low and high redshift subsamples are shown in Figures
16b and 16c, and results from the secondary, fiber field sample are
shown in Figure 16d. The mean redshift is $\langle z \rangle = 2.1$
for both the low-z subsample and the secondary sample, and
$\langle z \rangle = 2.75$ for the high-z subsample. In every panel,
the solid curve shows the linear P(k) of the CDM model at z = 2.5,
while the dashed curves in panels (b)-(d) show the CDM P(k) at the
mean redshift of the sample in question. The effect of linear
evolution is subtle because of the modest redshift range and the
cancellation effects already mentioned. The linear growth in this
model assumes $\Omega_0 = 1$, but results would be similar for other
cosmological parameters because $\Omega$ approaches one at high
redshift in all models.

The P(k) shape for the fiducial sample is consistent with, or perhaps
slightly steeper than, the P(k) predicted by the CDM model. The P(k)
shapes for the other samples are also consistent with the model, and
hence with being noisier realizations of the P(k) shape measured from
the fiducial sample. The normalization uncertainties for the
subsamples (shown by the error bars in the bottom left of each panel)
are significantly larger than the small P(k) shifts predicted by
linear evolution, so we cannot achieve a positive detection of linear
growth with this data set. However, we do find that the results for
the high-z and low-z subsamples are consistent with linear growth; in
particular, there is no significant change in the measured shape of
P(k) between z = 2.75 and z = 2.1. It is especially reassuring to see
that the secondary sample, which is wholly independent of our main
sample, gives perfectly consistent results.

Fig. 16. - Redshift evolution of P(k). Panel (a) shows P(k) for our
fiducial sample at z = 2.5, together with a $\sigma_8 = 0.6$,
$\Omega_0 = 1$ CDM model at the same redshift, for reference. In
panels (b) and (c) we have split the whole sample (all the spectra
plotted in Fig. hi) into two halves, which have the mean redshifts
given in the plot labels. The $\sigma_8 = 0.6$ CDM model is again
shown, this time at z = 2.5 and at the appropriate z for the
subsample. Panel (d) shows results from a wholly independent sample
of 9 QSO spectra in a 40 arcmin AAT field. The mean redshift of this
sample is about the same as the low-z half of the main sample, and
the effective number of QSOs is also about the same as in the low-z
half of the main sample. The different samples are described in more
detail in Section 2.1.

We can quantify this consistency by carrying out power law fits to
the power spectra derived from the subsamples, using the procedure
described in Section 5.1. We again use equation (8), with the same
pivot wavenumber, $k_p = 0.008$ (km s$^{-1}$)$^{-1}$. The value of
$\chi^2$ per degree of freedom for the best fit power laws in these
cases varies between 2 and 4, which should only occur 7% and 0.02% of
the time, respectively. The high $\chi^2$ values probably indicate
that our method of scaling errors to small subsamples from the main
sample gives errors that are somewhat too small. In the future, with
larger data sets, it will be possible to derive the errors directly
from the different redshift subsamples. Here, to the extent that it
is possible to compare results, we find that the fiducial sample fit
parameters fall within or near the formal $2\sigma$ confidence
contours for the subsample results, assuming linear evolution (and
$\Omega_0 = 1$). If we consider each parameter individually, we
find for the (z) = 2.1 subsample, A 2 p {k p ) = O.QltlH and 
n = —1.10 ± 0.55, and for the secondary sample (also at 



(z) = 2.1), A 2 p (k p ) = 0.471^8 and n = -2- 81 ± °- 24 ( a11 
errors are la). If CIq = 1 we would expect A 2 (k p ) — 0.70 
for both of these samples, based on scaling the result for 
the fiducial sample. For the (z) — 2.75 subsample, we 
find A 2 p (k p ) = 0.35±° ;H and n = -2.90 ± 0.26, where 
$\Delta^2_\rho(k_p) = 0.51$ is expected for $\Omega_0 = 1$. Because
the subsamples are not independent of the fiducial sample, this
comparison is not completely rigorous. We still expect, however, that
any deviation from linear growth would have to be fairly small in
order to escape detection. The fact that logarithmic slopes of
different subsamples differ by up to $2\sigma$ suggests that our
$\chi^2$ procedure may underestimate the true uncertainty in $n$. The
agreement of $\Delta^2_\rho(k_p)$ values at the $1\sigma$ level
suggests that our estimate of the normalization uncertainty is
reasonably accurate.
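The expected amplitudes of 0.70 and 0.51 quoted above can be reproduced with a short calculation for an Einstein-de Sitter model. The sketch below is our own reconstruction of the scaling, not code from the paper. It assumes that P(k) at fixed comoving k grows as $(1+z)^{-2}$, that the comoving wavenumber sampled by a fixed velocity-space $k_p$ scales as $H(z)/(1+z) \propto (1+z)^{1/2}$, and that the fitted power law slope holds over the small shift in scale.

```python
def expected_delta2(delta2_ref, z_ref, z, slope_n):
    """Linear-theory Delta^2 at fixed velocity-space k for Omega_0 = 1.

    delta2_ref : Delta^2(k_p) measured at z_ref
    slope_n    : logarithmic slope n of P(k), so Delta^2 scales as k^(3+n)
    """
    # Growth of P(k) at fixed comoving k: proportional to (1+z)^-2.
    growth = ((1.0 + z_ref) / (1.0 + z)) ** 2
    # Comoving k probed by a fixed velocity k: proportional to (1+z)^(1/2).
    k_shift = ((1.0 + z) / (1.0 + z_ref)) ** 0.5
    return delta2_ref * growth * k_shift ** (3.0 + slope_n)

for z_sample in (2.1, 2.75):
    value = expected_delta2(0.57, 2.5, z_sample, -2.25)
    print(f"<z> = {z_sample}: expected Delta^2(k_p) = {value:.2f}")
```

The two effects partly cancel: between z = 2.5 and z = 2.1 the growth factor alone would raise the amplitude by about 27%, while the shift to larger comoving scales lowers it by about 4%.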

5.3. Comparison with theory 

We have already shown in Figure 16 the linear theory P(k) for a CDM
model that has roughly the correct shape and amplitude to match our
observed P(k). In Phillips et al. (1998), we conduct detailed
comparisons of our P(k) results to the predictions of COBE-normalized
CDM models, and we discuss how Ly$\alpha$ P(k) measurements may be
used to break degeneracies between cosmological parameters that are
left by other measurements (e.g., Efstathiou & Bond 1998). In
Weinberg et al. (1998a) we combine our results with constraints from
the mass function of rich galaxy clusters to estimate the value of
$\Omega_0$. In this paper, our main emphasis is on the presentation
and testing of our observational results, so we limit our theoretical
discussion to an illustrative and qualitative comparison between our
measured P(k) and the predictions of a few CDM models.

Three of the linear power spectra we compare to are those used in the
hydrodynamic simulations for which we tested P(k) recovery in CWKH.
All these models have an inflationary power spectrum with $n = 1$.
The first is SCDM, "standard" CDM, a model with $\Omega_0 = 1$,
$h = 0.5$, $\Omega_b = 0.05$, and $\sigma_8 = 0.7$. This value of
$\sigma_8$ is roughly consistent with (but somewhat higher than) that
advocated by White, Efstathiou, & Frenk (1993) to match the observed
masses of rich galaxy clusters. Our second model is identical to the
first except that $\sigma_8 = 1.2$. This higher amplitude is
consistent with the 4-year COBE data (Bennett et al. 1996), and we
therefore label the model CCDM. The third model, OCDM, assumes an
open universe with $\Omega_0 = 0.4$, $h = 0.65$, and
$\Omega_b = 0.03$. This model is also COBE-normalized, with
$\sigma_8 = 0.75$ (Ratra et al. 1997). A $\Lambda$CDM model with a
modest "tilt" of the primeval power spectrum ($n_p \approx 0.9$)
would yield a similar prediction.

The linear power spectra of these models are plotted in Figure 17,
together with that of the $\sigma_8 = 0.6$ CDM model already shown in
Figure 16. The measured P(k) is somewhat steeper than that of the
SCDM model and even the OCDM model: the points with
$k > 4 \times 10^{-3}$ (km s$^{-1}$)$^{-1}$ all lie below the model
curves, and the points with $k < 4 \times 10^{-3}$
(km s$^{-1}$)$^{-1}$ all lie on or above the curves. However, given
the current statistical uncertainties, the difference in slope is at
most suggestive. Perhaps more impressive is the fact that the linear
mass power spectrum, which has never previously been measured on
these scales, has approximately the slope predicted by the physical
model of inflationary fluctuations in a CDM-dominated universe.
(Studies of galaxy clustering at z = 0 probe the non-linear rather
than the linear power spectrum on these scales, and the shape of the
galaxy and mass power spectra could be different because of
scale-dependent bias in the non-linear regime.)

The amplitude of the measured P(k) is somewhat lower than that of the
SCDM and OCDM models, though it is consistent with these models
within the $1\sigma$ normalization error (shown in the lower left).
The $\Omega_0 = 1$, $h = 0.5$, $\sigma_8 = 0.6$ model appears to have
about the right amplitude. Since the rms mass fluctuation amplitude,
$\sigma_\rho \propto \sqrt{P(k)}$, is a factor of two larger in the
CCDM model, and the uncertainty in the measured amplitude is only 18%
(Section 3.2), our results rule out the CCDM model at the
$\sim 5\sigma$ level. Of course this model is already known not to be
viable because it predicts excessively massive galaxy clusters at
z = 0 (e.g., White et al. 1993), but that failure reflects a
combination of the high P(k) amplitude and the high mass density
($\Omega_0 = 1$), both of which influence cluster masses. The present
test, based on independent data at a different redshift, shows that
the amplitude of mass fluctuations in the CCDM model is too high
regardless of the value of $\Omega_0$.

5.4. Comparison with observations of galaxy clustering 

The success of recent searches for Lyman Break Galaxies (LBGs, see
Pettini et al. 1998 for a recent review) has opened a new window on
structure in the high redshift universe: the clustering of
star-forming galaxies at $z \sim 3$ (Steidel et al. 1998; Giavalisco
et al. 1998; Adelberger et al. 1998). The mean redshifts of the LBG
samples are close to the mean redshift of our Ly$\alpha$ forest data.
We can therefore compare our measurement of mass clustering to the
measurements of galaxy clustering and obtain a direct measurement of
the bias between galaxy and mass fluctuations at high redshift.

Giavalisco et al. (1998) have measured the angular clustering of a
sample of 871 galaxies in a narrow redshift range centered on
z = 3.04. By inverting the angular clustering using Limber's equation
and the estimated redshift distribution, they obtain an estimate of
the real space correlation function, $\xi(r)$. Fitting their results
to the power law
form exhibited by low-z galaxies, £(r) = (r/ro)~ 7 , they 
find 7 = 1.80±g;f| and r Q = 2.lt°i h^Mpc for fl = 1 
or r = 3.3t° 7 6 /i _1 Mpc for fl = 0.2, A = 0. To com- 
pare these results with P{k) measurements, we have con- 
verted the power law fit to £(/') into a power law in P(k), 
$P(k) = C k^n$, using the fact that, for $-2 < n < 0$,

$$\xi(r) = \frac{C}{2\pi^2}\, \Gamma(2+n)\, \sin\left[(2+n)(\pi/2)\right] r^{-(3+n)}. \qquad (11)$$



This power law fit to the P(k) of LBG clustering is plotted in Figure
18. The Giavalisco et al. (1998) analysis only uses galaxy pairs with
angular separations less than 330 arcsecs, so in Figure 18 we plot
the inferred galaxy P(k) only out to $k = 2\pi/r_{\rm max}$, where
$r_{\rm max}$ is the comoving lengthscale corresponding to 330
arcsecs for the assumed cosmology.
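The conversion from a power law correlation function to a power law P(k) can be written out explicitly from equation (11). The sketch below is illustrative: the function names are ours, and the input values are the central values quoted in the text ($r_0 = 2.1\ h^{-1}$ Mpc, $\gamma = 1.8$) with their uncertainties ignored. Matching $\xi(r) = (r/r_0)^{-\gamma}$ to equation (11) gives $n = \gamma - 3$ and fixes the amplitude $C$.

```python
import math

def power_law_pk_from_xi(r0, gamma):
    """Return (C, n) with P(k) = C k^n for xi(r) = (r/r0)^-gamma."""
    n = gamma - 3.0
    if not -2.0 < n < 0.0:
        raise ValueError("equation (11) is used here only for -2 < n < 0")
    angular = math.gamma(2.0 + n) * math.sin((2.0 + n) * math.pi / 2.0)
    amplitude = 2.0 * math.pi ** 2 * r0 ** gamma / angular
    return amplitude, n

def xi_from_power_law(r, amplitude, n):
    """Equation (11): correlation function of a pure power law P(k)."""
    angular = math.gamma(2.0 + n) * math.sin((2.0 + n) * math.pi / 2.0)
    return amplitude / (2.0 * math.pi ** 2) * angular * r ** -(3.0 + n)

amp, slope = power_law_pk_from_xi(2.1, 1.8)
print(f"n = {slope:.2f}, C = {amp:.1f} (h^-1 Mpc)^(3+n)")

# Round trip: the correlation function should equal 1 at r = r0.
print(xi_from_power_law(2.1, amp, slope))
```

With $\gamma = 1.8$ the slope is $n = -1.2$, which is the LBG slope compared with the mass power spectrum later in this section.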

Using a counts-in-cells analysis of a sample with full redshift
information, Adelberger et al. (1998) estimate a higher amplitude of
LBG clustering, corresponding to $r_0 = 4 \pm 1\ h^{-1}$ Mpc for
$\Omega_0 = 1$ and $\gamma = 1.8$. This analysis uses cells of quite
large comoving volume (8 $h^{-1}$ Mpc cubes for $\Omega_0 = 1$),
which would be influenced by fluctuations on scales larger than those
probed by our P(k) measurement. We therefore plot only the Giavalisco
et al. (1998) results in Figure 18, with the proviso that they are
likely to be slightly low in amplitude.

The mass P(k) in Figure 18 is from the fiducial sample, with a mean
redshift z = 2.5. We rescale P(k) assuming linear growth to the
redshift z = 3.04 of the LBG results, for two different cosmological
models. The redshift extrapolation is slightly different for the two
cosmologies, but the primary influence of cosmological parameters is
on the conversion from km s$^{-1}$ to the comoving $h^{-1}$ Mpc units
used in Figure 18. The parameters have a similar but not identical
influence on the conversion of angular separations to comoving
$h^{-1}$ Mpc, so although the mass and galaxy power spectra are both
different in the two panels of Figure 18, the offset between them is
nearly the same.

The mass distribution is significantly non-linear on these scales
even at $z = 3$, but the P(k) measured from the Ly$\alpha$ forest is
representative of the primordial, linear P(k) for the reasons given
in Section 2.2. From the plots, it is evident that the primordial
P(k) is steeper than the LBG P(k) (with a logarithmic slope of
$-2.25$ rather than $-1.2$). The difference in slope is caused at
least partly by non-linear evolution, during which a transfer of
power from large to small scales tends to make P(k) of this type
shallower (see, e.g., Baugh & Efstathiou 1994). The dashed lines show
the non-linear mass P(k) computed from the three-dimensional power
spectrum of the mass distribution in the normalizing simulations. We
interpolate between the two outputs closest to $z = 3.04$, assuming
that the correct amplitude has been set by the Ly$\alpha$ results at
$z = 2.5$. The non-linear mass P(k) is indeed shallower than the
primordial one, but it is still at least marginally steeper than the
LBG P(k).

Fig. 17. The normalized P(k) from the fiducial observational sample
compared to 4 different CDM models (see text), all at $z = 2.5$.
Filled circles are plotted on scales where the numerical experiments
of CWKH show that the linear theory P(k) is correctly recovered. Open
circles represent the results on smaller scales. At the $1\sigma$
level, all points can be shifted up or down coherently by the
normalization error shown in the lower left.

Fig. 18. The normalized P(k) compared to Lyman Break Galaxy
clustering (from Giavalisco et al. 1998), at $z = 3.04$. The solid
line shows the LBG power law fit, and is only plotted for scales
$k > 2\pi/r_{\rm max}$, where $r_{\rm max}$ is the largest pair
separation used in the computation of LBG clustering. The dotted
lines show the effect of varying the LBG amplitude by $\pm 1\sigma$
while keeping the slope fixed and also of varying the slope by
$\pm 1\sigma$ while keeping the amplitude fixed. Points show the
linear mass P(k) derived from the Ly$\alpha$ forest data, scaled to
$z = 3.04$ assuming linear growth. Dashed lines show the non-linear
P(k) measured using a 3D FFT from the mass distribution in the
normalizing simulations. This non-linear mass P(k) was obtained by
interpolating between results from the two outputs closest to
$z = 3.04$.

Despite the statistical uncertainties in both sets of measurements,
it is clear that the galaxy clustering is substantially biased with
respect to the mass clustering. The ratio of the power spectra is
$\sim 3$-10, depending on scale, which translates to a bias factor
$b \sim 2$-3. If we adopt the larger LBG clustering amplitude found
by Adelberger et al. (1998), then the implied bias factors are about
a factor of two larger. The scale dependence of the bias implied by
Figure 18 is not unreasonable since the scales probed are in the
non-linear regime. However, it is also possible that we have
overestimated the steepness of the non-linear mass P(k) because of
the finite volume of our normalizing simulations, or that we have
incorrectly estimated the slope of the LBG P(k) because the inversion
equation (11) assumes that $\xi(r)$ is a power law on all scales that
contribute significantly to the measured P(k).
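
The step from a power-spectrum ratio to a bias factor is a square root. A minimal sketch, assuming the usual linear-bias definition $P_{\rm gal}(k) = b^{2} P_{\rm mass}(k)$ (the function name is illustrative):

```python
import math

def bias_from_power_ratio(p_galaxy, p_mass):
    """Linear bias b defined through P_gal(k) = b**2 * P_mass(k)."""
    return math.sqrt(p_galaxy / p_mass)

# The two ends of the range of power-spectrum ratios quoted in the text.
for ratio in (3.0, 10.0):
    b = bias_from_power_ratio(ratio, 1.0)
    print(f"P_gal / P_mass = {ratio:4.1f}  ->  b = {b:.2f}")
```

Because b depends on the ratio at each k, a ratio that varies with scale implies a scale-dependent bias, as discussed above.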

Measurements of the bias factor such as this one should be useful in
constraining theories of galaxy formation, especially as high-$z$
galaxy samples and Ly$\alpha$ forest samples increase in size and the
statistical uncertainties become smaller. One can ask how our direct
estimate of b compares with those made by Giavalisco et al. (1998)
and Adelberger et al. (1998), who set the amplitude of mass
fluctuations by requiring that the correct masses of clusters be
reproduced at $z = 0$ (see, e.g., White et al. 1993). Because the
$z = 0$ normalization depends on $\Omega_0$ and the extrapolation
from $z = 0$ to $z = 3.04$ depends on $\Omega_0$ and $\Lambda_0$, the
value of b inferred in this way depends strongly on the adopted
cosmological parameters. For an open $\Omega_0 = 0.2$ model, the LBG
results require a relatively low $b \sim 2$ (Adelberger et al. 1998).
For $\Omega_0 = 1$, a bias factor of 6 or even higher is required.
Since the value of b inferred from Figure 18 lies in between these
two extremes, it seems that our measurement favors an intermediate
value of $\Omega_0$. This constraint can be derived more cleanly by
comparing our mass P(k) at $z = 2.5$ directly to the combination of
$\Omega_0$ and the P(k) amplitude constrained by clusters at $z = 0$,
as discussed by Weinberg et al. (1998a).

Theoretical models of the LBG population consistently predict strong
bias between LBGs and mass, whether they are based on analytic
approximations (e.g., Mo & Fukugita 1996; Adelberger et al. 1998;
Baugh et al. 1998; Coles et al. 1998), the clustering of massive
halos in N-body simulations (e.g., Bagla 1998; Jing & Suto 1998;
Wechsler et al. 1998), the combination of N-body simulations with
semi-analytic galaxy formation models (Governato et al. 1998;
Kauffman et al. 1998b), or full hydrodynamic simulations of the LBG
population (Katz, Hernquist, & Weinberg 1998). In detail, the
predictions of the bias and its dependence on galaxy luminosity
depend on the way that LBGs populate their parent dark halos, on the
relation between an LBG's star formation rate and its mass, and on
other aspects of the theory of galaxy formation. While many models
are consistent with current estimates of the LBG bias (including the
estimate presented here), more precise measurements of the bias from
future Ly$\alpha$ forest and LBG data should help to constrain the
mechanisms of galaxy formation and the nature of LBGs.

If we want to compare our estimate of P(k) to the power spectrum of
low-$z$ galaxy samples, then our choice of background cosmology makes
a significant difference. For different values of $\Omega_0$ and
$\Lambda_0$, km s$^{-1}$ units at $z = 2.5$ map to very different
length scales at $z = 0$ (following equation [10]). For example, the
largest scale on which we can measure what we believe is the true
P(k) is $k = 2.7 \times 10^{-3}$ (km s$^{-1}$)$^{-1}$, which
corresponds to a wavelength of 12 $h^{-1}$ Mpc if $\Omega_0 = 1$ but
to 35 $h^{-1}$ Mpc for an accelerating universe with
$\Omega_0 = 0.2$ and $\Lambda_0 = 0.8$. The values of $\Omega_0$ and
$\Lambda_0$ also affect the linear growth factor over this redshift
range.
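
The unit conversion for the $\Omega_0 = 1$ case can be sketched as below. Equation (10) is not reproduced in this section, so the code assumes it has the standard form in which a velocity interval $\Delta v$ at redshift $z$ corresponds to a comoving length $\Delta v\,(1+z)/H(z)$, with $H_0 = 100\,h$ km s$^{-1}$ Mpc$^{-1}$. The function names are illustrative, and only the $\Omega_0 = 1$ numbers are evaluated here.

```python
import math

def kms_per_comoving_mpc(z, omega_m, omega_l):
    """H(z) / (1 + z) in km/s per comoving h^-1 Mpc (H0 = 100 h km/s/Mpc)."""
    omega_k = 1.0 - omega_m - omega_l
    e_z = math.sqrt(omega_m * (1.0 + z)**3 + omega_k * (1.0 + z)**2 + omega_l)
    return 100.0 * e_z / (1.0 + z)

def comoving_wavelength(k_kms, z, omega_m, omega_l):
    """Wavelength 2*pi/k, with k in (km/s)^-1, converted to comoving h^-1 Mpc."""
    return (2.0 * math.pi / k_kms) / kms_per_comoving_mpc(z, omega_m, omega_l)

# Largest and smallest scales of the measurement, for Omega_0 = 1 at z = 2.5.
for k in (2.7e-3, 1.4e-2):
    lam = comoving_wavelength(k, z=2.5, omega_m=1.0, omega_l=0.0)
    print(f"k = {k:.1e} (km/s)^-1  ->  {lam:.1f} h^-1 Mpc")
```

Lowering $\Omega_0$ or adding $\Lambda_0$ reduces $H(z)/(1+z)$ at fixed $z$, so the same velocity interval maps to a longer comoving length.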

In Figure 19 we plot the power spectrum of APM galaxies, recovered
from an inversion of their angular clustering by Baugh & Efstathiou
(1993; the data values themselves are taken from the table in
Gaztanaga & Baugh 1998). We also plot our linear P(k) results from
the Ly$\alpha$ forest for three different sets of background
cosmologies: an $\Omega_0 = 1$ model (Figure 19a), two flat, non-zero
$\Lambda$ models (Figure 19b), and two open models (Figure 19c). If
we compare the shapes and amplitudes of the power spectra, we can see
that the non-zero $\Lambda$ cosmologies appear to prefer some
antibias around $k \sim 0.2$-$0.4\,h$ Mpc$^{-1}$. The error bars are
large enough that this evidence is merely suggestive at present. The
dip below a power law seen in the APM P(k) on these scales has not
been clearly seen in measurements of the power spectrum from galaxy
redshift surveys (see, e.g., Vogeley 1998). If measurements from new,
larger galaxy surveys confirm that it is a real feature, then we
could try to look for it in future Ly$\alpha$ P(k) measurements, but
it would probably only be at an accessible range of scales in the
case of a non-zero $\Lambda$ Universe. We should bear in mind that
Figure 19 again compares the linear mass P(k) to the non-linear
galaxy P(k). Mode coupling is likely to have made the non-linear P(k)
significantly shallower on scales up to $k \sim 0.1\,h$ Mpc$^{-1}$
(Baugh & Efstathiou 1994; Croft & Gaztanaga 1998). The cosmologies
used in Figures 19a and 19c may therefore be consistent with little
or no galaxy bias on the scales of the Ly$\alpha$ forest measurement,
and non-linear evolution might reconcile the shapes of the power
spectra for the non-zero $\Lambda$ cosmologies.

6. SUMMARY AND DISCUSSION 

We have presented an estimate of the primordial power spectrum of
mass fluctuations, P(k). To arrive at this estimate, we have applied
the reconstruction method of CWKH to a set of 19 QSO spectra,
originally measured for other purposes. The method assumes that the
primordial density fluctuations are Gaussian, as predicted by
inflation, and that Ly$\alpha$ forest absorption arises predominantly
in the diffuse, photoionized IGM, as predicted by hydrodynamic
cosmological simulations. This physical picture of the Ly$\alpha$
forest generically predicts a simple, non-linear relation between the
mass overdensity and the Ly$\alpha$ optical depth. The one uncertain
parameter in this relation (the constant A of equation [2]) can be
fixed observationally by matching the mean opacity of the forest
($\tau_{\rm eff}$), yielding a P(k) measurement with no unknown
"bias factors."
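
How a single observed number can fix the constant A is illustrated by the toy calculation below. It is a sketch under stated assumptions, not the CWKH procedure: equation (2) is taken to have the power-law form $\tau = A\,(\rho/\bar\rho)^{\beta}$ with $\beta = 1.6$, the densities are drawn from an arbitrary lognormal distribution, and the value of $\tau_{\rm eff}$ is a placeholder rather than a measured one. All names are hypothetical.

```python
import math
import random

def mean_flux(amplitude, densities, beta):
    """Mean transmitted flux <exp(-tau)> with tau = A * (rho / rho_mean)**beta."""
    total = sum(math.exp(-amplitude * rho**beta) for rho in densities)
    return total / len(densities)

def calibrate_amplitude(densities, tau_eff, beta=1.6, lo=1e-6, hi=1e3, iterations=60):
    """Bisect in log(A) until the mean flux equals exp(-tau_eff)."""
    target = math.exp(-tau_eff)
    for _ in range(iterations):
        mid = math.sqrt(lo * hi)
        if mean_flux(mid, densities, beta) > target:
            lo = mid  # too little absorption: A must be larger
        else:
            hi = mid  # too much absorption: A must be smaller
    return math.sqrt(lo * hi)

random.seed(1)
SIGMA = 1.0     # width of the toy lognormal density field (assumption)
TAU_EFF = 0.28  # placeholder mean opacity (assumption, not a measured value)
densities = [math.exp(random.gauss(0.0, SIGMA) - 0.5 * SIGMA**2) for _ in range(5000)]

A = calibrate_amplitude(densities, TAU_EFF)
print(f"A = {A:.3f}, mean flux = {mean_flux(A, densities, 1.6):.4f}")
```

The mean flux decreases monotonically with A, so the match to $\exp(-\tau_{\rm eff})$ is unique, and that is what removes the free normalization.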

Our measurement of P(k) is made at a mean redshift $z = 2.5$ and
spans scales from $k = 1.4 \times 10^{-2}$ to $2.7 \times 10^{-3}$
(km s$^{-1}$)$^{-1}$, which correspond to wavelengths of 2-12
comoving $h^{-1}$ Mpc if $\Omega_0 = 1$. Fitting a power law to the
data points, we find a logarithmic slope $n = -2.25 \pm 0.18$. The
amplitude expressed in terms of the variance in the density field per
unit interval in $\ln k$ is
$\Delta^{2}(k_p) = k_p^{3} P(k_p)/2\pi^{2} = 0.57^{+0.26}_{-0.18}$.
Here $P(k_p)$ is the value of P(k) at a pivot wavenumber
$k_p = 8 \times 10^{-3}$ (km s$^{-1}$)$^{-1}$, chosen so that the
statistical errors in n and $P(k_p)$ are uncorrelated. The
uncertainty in the amplitude of $P(k_p)$ corresponds to a $1\sigma$
uncertainty of about 18% in the rms amplitude of mass fluctuations on
the scale $2\pi/k_p \sim 700$ km s$^{-1}$. This error estimate
includes the statistical uncertainty in $\tau_{\rm eff}$ quoted by
PRS; the impact of alternative determinations of $\tau_{\rm eff}$ can
be found from Figure |[

Fig. 19. The normalized P(k) compared to the APM P(k) inverted from
angular clustering (Baugh & Efstathiou 1993) at $z = 0$. We show
results obtained after using five different cosmologies to do the
extrapolation from $z = 2.5$ to $z = 0$. Note that these plots
compare a linear mass power spectrum to a non-linear galaxy power
spectrum. The differences between them illustrate the advantages of
using our measurement of the primordial P(k) to test theories
directly, despite its large statistical uncertainty. If any one of
the background cosmologies in these three panels is the correct one,
then it is likely that both non-linear matter evolution and galaxy
formation physics must be invoked in order to explain the galaxy
power spectrum results.
[Three panels; horizontal axis k in ($h^{-1}$ Mpc)$^{-1}$; legend
entries in the open-model panel: Ly$\alpha$, $\Omega_0 = 0.2$,
$\Lambda_0 = 0$ and Ly$\alpha$, $\Omega_0 = 0.4$, $\Lambda_0 = 0$.]
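
The link between an error on P(k) and an error on the rms fluctuation amplitude follows from $\sigma \propto \sqrt{P(k)}$. As a short worked example, the sketch below applies this to the normalization error of +40%, -29% in P(k) listed with Table 1; the function name is illustrative.

```python
import math

def rms_fractional_errors(frac_up, frac_down):
    """Map fractional errors (+up, -down) on P(k) to errors on sigma ~ sqrt(P)."""
    up = math.sqrt(1.0 + frac_up) - 1.0
    down = 1.0 - math.sqrt(1.0 - frac_down)
    return up, down

up, down = rms_fractional_errors(0.40, 0.29)
print(f"rms amplitude error: +{100 * up:.0f}% / -{100 * down:.0f}%")
```

The square root roughly halves the fractional error, giving about +18% and -16%, in line with the uncertainty of about 18% quoted for the rms amplitude.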

There are a number of reasons for thinking that this measurement of
P(k) is robust, in the sense that any systematic errors are no larger
than our $1\sigma$ statistical errors. First, the tests of CWKH show
that our method successfully recovers the true linear P(k) from full
hydrodynamic simulations of three different cosmological models, even
using artificial spectra of moderate resolution and signal-to-noise
ratio. Second, we have examined (in Figures |^ and ||) the effects of
changing the operational parameters used in our data preparation
procedure. We find that continuum fitting uncertainties set an upper
limit to the scale on which we can measure P(k), at
$k \sim 2 \times 10^{-3}$ (km s$^{-1}$)$^{-1}$. On smaller scales,
reasonable variations on our standard procedure do not influence our
results at the $1\sigma$ level. Third, we have examined (in Section
4) the most obvious potential source of "spurious" fluctuations in
the Ly$\alpha$ forest, spatial variations in the UV background
intensity, and shown that they should have negligible impact on P(k)
on the scales accessible with our current data. Fourth, the P(k)
estimates determined separately from the low redshift and high
redshift subsamples of the data are consistent with the hypothesis of
a single underlying power spectrum experiencing linear growth from
$z = 2.75$ to $z = 2.1$, albeit with large statistical uncertainties
owing to the smaller size of the subsamples (Figure 16). Fifth, the
P(k) determined from the low-$z$ subsample is consistent with the
P(k) determined from an entirely independent set of nine QSO spectra
(the secondary sample described in Section 2.1) with the same mean
redshift (Figure O). Our P(k) measurement also agrees well with the
measurement presented in CWKH from Songaila & Cowie's (1996) Keck
HIRES spectrum of Q1422+231, which has a mean absorption redshift
$z = 3.2$. The statistical precision of the present measurement is
much higher than that of the Q1422+231 measurement because of the
greater number of QSOs that contribute to it.

There are two general ways that our P(k) measurement can be checked
using existing or readily obtainable Ly$\alpha$ forest data. The
first is simply to confirm the result with independent data sets,
preferably ones that have larger numbers of QSO spectra and hence
yield smaller statistical error bars. With a larger data set, one
could also carry out a more exacting version of the redshift
evolution test illustrated in Figure 16. Our present data are
consistent with linear growth of P(k), but positive detection of the
expected growth (not possible with our current statistical errors)
would be a strong empirical indication that the P(k) derived from the
Ly$\alpha$ forest indeed represents fluctuations in an evolving mass
density field. The second general approach is to test the predictions
of a model with Gaussian initial conditions and our derived P(k)
against other statistical properties of the Ly$\alpha$ forest, using
high resolution spectra. For example, the flux decrement distribution
function (Rauch et al. 1997) depends mainly on the amplitude of P(k)
and on the PDF (Gaussian vs. non-Gaussian) of the primordial
fluctuations (Weinberg et al., in preparation). This statistic can
therefore be used to test our P(k) determination and to test the
theoretical assumption that is critical to our method, the hypothesis
of Gaussian initial conditions. At a greater level of detail, one can
check that spatial variations in statistical properties of the forest
(from QSO to QSO or within individual spectra) are consistent with
expectations, to constrain any coherent spatial fluctuations in the
temperature-density relation. Finally, larger samples of high
resolution spectra can be used to measure $\tau_{\rm eff}$ (as in
Rauch et al. 1997), better constraining the observational parameter
used in our P(k) normalization.

Comparison of our derived P(k) to the measured clustering of Lyman
Break Galaxies implies that the latter are a highly biased
population, with a bias factor $b \sim 2$-5. While the statistical
errors in both the Ly$\alpha$ P(k) and the LBG P(k) are presently
large, this is arguably the most direct measurement of bias between
galaxies and mass to date. The bias factors inferred from comparisons
of galaxy density and peculiar velocity fields (Strauss & Willick
1995 and references therein) or from redshift-space distortion
analyses (Hamilton 1998 and references therein) depend strongly on
the assumed value of $\Omega_0$, roughly $b \propto \Omega_0^{0.6}$.
The bias measurement presented here is only weakly dependent on
cosmological parameters, as one can see by comparing Figures 18a and
18b. The comparison between our derived P(k) and the power spectrum
of present day galaxies does depend on cosmology (Figure 19), since
the values of $\Omega_0$ and $\Lambda_0$ affect the amount of
fluctuation growth and the change in velocity scales over the large
redshift interval from $z = 2.5$ to $z = 0$.
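
The strength of the $\Omega_0$ dependence in the dynamical estimates can be seen from a short calculation. The sketch assumes, as is standard for such analyses, that they constrain the combination $\beta = \Omega_0^{0.6}/b$; the value $\beta = 0.5$ used below is a hypothetical input chosen only to show the scaling, not a measurement.

```python
def bias_from_beta(beta, omega_0):
    """Bias implied by a measured beta = omega_0**0.6 / b for an assumed omega_0."""
    return omega_0**0.6 / beta

BETA = 0.5  # hypothetical measured value, for illustration only
for omega_0 in (0.2, 0.4, 1.0):
    print(f"Omega_0 = {omega_0:.1f}  ->  b = {bias_from_beta(BETA, omega_0):.2f}")

# Ratio of the bias inferred for Omega_0 = 1 to that inferred for Omega_0 = 0.2.
ratio = bias_from_beta(BETA, 1.0) / bias_from_beta(BETA, 0.2)
```

For a fixed measured $\beta$, changing the assumed $\Omega_0$ from 0.2 to 1 changes the inferred bias by a factor of $0.2^{-0.6}$, roughly 2.6, whatever the value of $\beta$.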

The slope and amplitude of the derived P(k) are consistent with the
predictions of some scale-invariant, COBE-normalized CDM models
(e.g., the OCDM model in Figure 17, with $\Omega_0 = 0.4$,
$h = 0.65$, $\sigma_8 = 0.75$) and inconsistent with others (e.g.,
the CCDM model, with $\Omega_0 = 1$, $h = 0.5$, $\sigma_8 = 1.2$). As
we show in a separate paper (Phillips et al. 1998), COBE-normalized
CDM models with a variety of $\Omega_0$ and $h$ values can fit the
Ly$\alpha$ P(k) if the primordial spectral index $n_p$ is treated as
a free parameter, but within any given class of models (e.g., open
CDM) one obtains a constraint on a parameter combination of the form
$\Omega_0 h^{\alpha} n_p^{\beta}$. In Weinberg et al. (1998a) we show
that consistency between our P(k) derived at $z = 2.5$ and
constraints from the cluster mass function at $z = 0$ requires a low
value of $\Omega_0$ ($\Omega_0 \sim 0.45$ for $\Lambda_0 = 0$ and
$\Omega_0 \sim 0.35$ for $\Lambda_0 = 1 - \Omega_0$) if the power
spectrum has the large scale shape implied by studies of galaxy
clustering. Perhaps the most significant theoretical implication of
our results, already evident in Figure 17, is that inflation + CDM
models, originally motivated by considerations of microwave
background anisotropies at $z \sim 1000$ and large scale structure at
$z \sim 0$, predict a P(k) that is at least roughly consistent with
our measurement, even though it probes a different epoch of cosmic
history and is based on a complex analysis of entirely different
observational phenomena.

The main requirement for improving the precision of our P(k)
measurement is the analysis of a larger sample of QSO spectra. With
100 full Ly$\alpha$ forest spectra, it should be possible to reduce
the statistical uncertainty in the amplitude of P(k) below 10%,
provided that $\tau_{\rm eff}$ is determined with sufficient
precision from high resolution spectra. Greater statistical precision
will merit a more detailed examination of some potential systematic
errors, and it will also be worth testing continuum fitting
procedures on large simulations to see if the P(k) determination can
be extended to larger scales. In the slightly more distant future,
analysis of spectra towards pairs or close multiples of QSOs can be
used to measure redshift-space distortions of clustering and thereby
constrain spacetime geometry, as proposed by CWKH, Hui, Stebbins, &
Burles (1998), and McDonald & Miralda-Escude (1998). Eventually, the
giant samples of QSO spectra from the 2dF and Sloan redshift surveys
may yield a truly three-dimensional view of evolving large scale
structure in the intergalactic medium. The moderate spectral
resolution of these samples ($\sim 8$ Å and $\sim 2.5$ Å,
respectively) is not in itself an obstacle to such a program: as our
results here show, by treating each spectrum as a continuous map
instead of a collection of lines, one can measure large scale
fluctuations without resolving small scale features. The important
question will be whether the unabsorbed continuum can be determined
with sufficient accuracy from such data over scales larger than the
typical transverse separations of QSO lines of sight.

The power of the Ly$\alpha$ forest as a test of cosmological theories
derives from the simplicity of the physics that governs the absorbing
medium and from the existence of an observable quantity,
$\tau_{\rm eff}$, that calibrates the relation between underlying
mass fluctuations and the observable fluctuations in QSO flux. The
results presented in this paper illustrate the promise of studies
that use the Ly$\alpha$ forest to trace the formation of structure in
the Universe.

Table 1

The linear P(k) at z = 2.5. We give the wavenumber k, P(k), and the
$1\sigma$ error on P(k). An additional error should also be assigned
to the normalization of all points, which is +40%, -29% in P(k)
($1\sigma$).

k (km s^-1)^-1     P(k) (km s^-1)^3     sigma[P(k)] (km s^-1)^3
2.66 x 10^-3       3.6 x 10^8           1.9 x 10^8
3.52 x 10^-3       1.7 x 10^8           8.9 x 10^7
4.65 x 10^-3       9.4 x 10^7           2.8 x 10^7
6.14 x 10^-3       3.4 x 10^7           1.0 x 10^7
8.12 x 10^-3       1.7 x 10^7           5.1 x 10^6
1.07 x 10^-2       1.3 x 10^7           3.8 x 10^6
1.42 x 10^-2       6.2 x 10^6           1.1 x 10^6
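
As an illustration of how the tabulated points relate to the quoted slope and amplitude, the sketch below fits $\ln P = \ln P(k_p) + n \ln(k/k_p)$ to the Table 1 values by weighted least squares and evaluates $\Delta^{2}(k_p) = k_p^{3} P(k_p)/2\pi^{2}$ at $k_p = 8 \times 10^{-3}$ (km s$^{-1}$)$^{-1}$. It treats the listed $1\sigma$ errors as independent and ignores the overall normalization error, which is a simplifying assumption and not necessarily the fitting procedure used for the quoted results, so it need not reproduce $n = -2.25$ and $\Delta^{2}(k_p) = 0.57$ exactly.

```python
import math

# (k in (km/s)^-1, P(k) in (km/s)^3, 1-sigma error) transcribed from Table 1.
TABLE_1 = [
    (2.66e-3, 3.6e8, 1.9e8),
    (3.52e-3, 1.7e8, 8.9e7),
    (4.65e-3, 9.4e7, 2.8e7),
    (6.14e-3, 3.4e7, 1.0e7),
    (8.12e-3, 1.7e7, 5.1e6),
    (1.07e-2, 1.3e7, 3.8e6),
    (1.42e-2, 6.2e6, 1.1e6),
]

def fit_power_law(points, k_pivot):
    """Weighted least squares for ln P = ln P_pivot + n * ln(k / k_pivot)."""
    xs = [math.log(k / k_pivot) for k, _, _ in points]
    ys = [math.log(p) for _, p, _ in points]
    ws = [(p / err)**2 for _, p, err in points]  # 1 / variance of ln P
    sw = sum(ws)
    x_mean = sum(w * x for w, x in zip(ws, xs)) / sw
    y_mean = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - x_mean)**2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - x_mean) * (y - y_mean) for w, x, y in zip(ws, xs, ys))
    slope = sxy / sxx
    return slope, math.exp(y_mean - slope * x_mean)

K_PIVOT = 8e-3  # pivot wavenumber in (km/s)^-1
n, p_pivot = fit_power_law(TABLE_1, K_PIVOT)
delta2 = K_PIVOT**3 * p_pivot / (2.0 * math.pi**2)
print(f"n = {n:.2f}, Delta^2(k_p) = {delta2:.2f}")
```

A fit that includes the covariance between points, or a different weighting, would shift both numbers somewhat within the quoted uncertainties.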



